Blog 7: Right to Fair Representation
The following case study examines the misrepresentation of South Asian cultural identities through focus groups in which participants shared their experiences using text-to-image generation software for cultural representation. It also discusses the implications of such software in perpetuating harmful stereotypes, as well as its predominantly Western viewpoint.
Read The Case Study Here:
AI’s Regimes of Representation: A Community-Centered Study of Text-to-Image Models in South Asia
Examining Text-to-Image tools from a non-Western lens
Regarding the misrepresentation of non-Western imagery in text-to-image generation software, I can't necessarily speak to what these tools get wrong, because I'm exactly the kind of person they were built to represent; but I don't believe that means I have no part in the discussion as a whole. In fact, this kind of reporting is especially important for people like me, because we won't always immediately recognize when something is wrong, and that insight is crucial for preventing misinformation and the perpetuation of harmful stereotypes among Westerners specifically. If these models are to become widely implemented, they need to be appropriately tuned to serve all audiences, which cannot be done with current large-scale benchmark testing alone. There are far too many nuances for a single person or team to adequately address, so focus groups like the one in the case study may be the best way to confront this issue. As the study shows, participants quickly found glaring problems with the representation of South Asian cultures that would have gone unchecked had the tools not been evaluated by people knowledgeable about the subject matter. These tools, then, must be evaluated by a wide range of people.
As for where this technology is headed, I don't imagine it's leaving any time soon. My personal take would be to get rid of generative AI slop entirely, but companies seem adamant about forcing it into every aspect of life with no real way to push back, so the least we can demand of them is to make the technology more inclusive of non-Westerners. If the main issue with these models is simply that they're poorly tuned, then I do believe that following the case study's method would allow companies to identify and fix the problems present. However, if the underlying problem is the data the models were trained on (which I believe is the more likely explanation), then it might be possible to remove inaccurate portrayals by labeling data more carefully and training on relevant data sourced by individuals like those in the focus group. The trouble is that these tools will already have been trained on flawed data, so a great deal of retraining would be required, which I unfortunately can't envision most companies doing. At the very least, they must rework their models to prevent the generation of harmful imagery.
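To make the community-driven evaluation idea a little more concrete, here is a minimal sketch of how feedback from a focus group could be logged and summarized before any retraining decisions are made. Everything in it, including the prompts, cultures, and flags, is a made-up illustration rather than anything taken from the case study itself.

```python
# Minimal sketch of logging and summarizing focus-group reviews of a
# text-to-image model. All prompts, flags, and notes are hypothetical.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Review:
    prompt: str        # culturally specific prompt given to the model
    culture: str       # community the prompt is meant to represent
    stereotyped: bool  # did the reviewer flag the output as stereotyped?
    notes: str         # free-form feedback from the focus-group reviewer

# Example entries a focus group might produce (entirely invented).
reviews = [
    Review("a wedding in Lahore", "Pakistani", True, "generic 'exotic' imagery"),
    Review("a street food vendor in Dhaka", "Bangladeshi", False, "reasonably accurate"),
    Review("a classroom in rural Nepal", "Nepali", True, "poverty-centric framing"),
]

# Count how often outputs were flagged per culture to prioritize fixes.
flag_counts = defaultdict(lambda: [0, 0])  # culture -> [flagged, total]
for r in reviews:
    flag_counts[r.culture][1] += 1
    if r.stereotyped:
        flag_counts[r.culture][0] += 1

for culture, (flagged, total) in flag_counts.items():
    print(f"{culture}: {flagged}/{total} outputs flagged as stereotyped")
```

Even a simple tally like this would at least show companies where their models are failing specific communities, which is the kind of signal that benchmark scores alone won't surface.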
Now, considering that cultural representation is always changing, these companies face the additional challenge of constantly keeping up with it. I can't imagine how such models could be continuously updated to reflect current trends, since they would need a constant stream of data from all over the world that can't feasibly be labeled and tuned in time. This is why all of these models have 'cutoff dates', and while some can now pull resources from the internet, there's still no telling whether the data they retrieve is appropriate, accurate, and not itself generated by another tool. For now, I believe companies should focus on fixing what's in people's hands today before trying to tackle future iterations, since they already have a lot on their plate in that regard.
Consider the following
Do you think the issues prevalent in these models stem from the model's tuning, the labeling of the data, the data itself, or something else? Considering your choice, what steps could companies take to rectify the errors caused by that issue (more accurate data work, focus groups, etc.)?
I explored this question in my main response, and I imagine it would be a great way to get people thinking about where these models go wrong. I thought about including a sentence on whether these models might be purposefully malicious and what companies would have to lose or gain from that, but I worried it would make the question too long and cluttered. As for the wording, I wanted the first question to have some explicit answers, then leave readers with a more open-ended second question to properly spark discussion.
Final Reflection
This case study felt like an important read for me, since I had never really thought about the fact that these generative models are hyper-Westernized. It sort of makes sense considering they're mostly developed in the US, but it's got me wondering why they trend in this direction, since I assumed developers would want to train on as much data as they could get their hands on, including data from outside North America and Europe. Maybe it's an issue of acquisition, language, or something else, but they definitely ought to broaden the scope of their data if they want to push these tools into other markets. Either way, I feel like this case study was catered to me without actually being written for me, and I gained a lot of insight from it. I also felt like I answered the questions much better this time (I didn't even include them in the post!) and provided some insight of my own.
