It is a well-known modern problem that mis- and disinformation are everywhere and spread like wildfire. After seeing the massive increase in the use of AI, and the backlash AI has received, I decided to use my computer to check the validity of those claims myself, specifically the claim that AI could be used by malicious threat actors to create fake news at a wide, industrial scale.

I decided to use llama.cpp to do this, in order to show that it can be done at low cost and at speed. Specifically, I ran the Wizard-Vicuna 13B uncensored model under llama.cpp.

Additionally, I used, and highly recommend using, GPU acceleration. My findings, in short, were as follows:

I decided to split the system into three smaller functions. The first scrapes popular news sites (currently the New York Times, the Wall Street Journal, and Fox News). The second takes in a headline and uses a chatbot-style prompt with Wizard-Vicuna to create a headline that, while not identical to the original, is quite similar; this makes it possible to create several AI articles from one human article. The third passes that headline to Wizard-Vicuna again and tells it, this time, to write an article based on that headline.
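The three stages above can be sketched roughly as follows. This is a minimal illustration, not the actual project code: the prompt wording is my own guess, and `generate` stands in for whatever wrapper invokes llama.cpp and returns the model's completion.

```python
def rewrite_headline(headline, generate):
    # Stage two: ask the model for a similar-but-different headline.
    # The prompt wording here is illustrative, not the original.
    prompt = (
        "USER: Rewrite this news headline so it tells the same story "
        f"in different words: {headline}\nASSISTANT:"
    )
    return generate(prompt).strip()

def write_article(headline, generate):
    # Stage three: turn the rewritten headline into a full article.
    prompt = f"USER: Write a news article with the headline: {headline}\nASSISTANT:"
    return generate(prompt).strip()

def headline_to_articles(headline, generate, n=10):
    # One scraped human headline fans out into n AI-written articles.
    return [write_article(rewrite_headline(headline, generate), generate)
            for _ in range(n)]
```

The scraping stage is omitted here since it is ordinary site-specific HTML parsing; the fan-out in `headline_to_articles` is what lets a single human article seed many fakes.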

I was able to write the Python code for the whole thing in a couple of minutes, as all I needed to do was capture the output of a shell command, split off the part I wanted, and pass it to the next function. The most time-consuming parts were designing a working prompt and scraping the news sites. I believe this ease of doing it, just downloading a few tools and writing some simple code that even a beginner could manage, shows how potent this is.
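The "output of a shell command, then a split" glue code is about as simple as it sounds. A hedged sketch, assuming llama.cpp's CLI binary and a quantized model file at paths of my own invention (the flags and filenames are illustrative):

```python
import subprocess

def run_llama(prompt, model="models/wizard-vicuna-13b-uncensored.bin"):
    # Invoke llama.cpp's command-line binary and capture its stdout.
    # Binary name, model path, and token limit are assumptions.
    result = subprocess.run(
        ["./main", "-m", model, "-p", prompt, "-n", "512"],
        capture_output=True, text=True, check=True,
    )
    return extract_completion(result.stdout, prompt)

def extract_completion(raw_output, prompt):
    # llama.cpp echoes the prompt back before the completion,
    # so split on the prompt and keep what follows.
    return raw_output.split(prompt, 1)[-1].strip()
```

Chaining two such calls (headline pass, then article pass) is the entire generation pipeline.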

Evaluation:

For this evaluation, we will use an article chosen at random. In this case, the article is: “Bo Derek reflects on giving back to American veterans: ‘There’s just so much we don’t do for our heroes’”. I was able to create 10 articles based on that one article in approximately 16 minutes. None had to be thrown out by my automatic checks for blatant problems, such as the AI trying to speak for me in the headline, or the article being too short.
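Automatic checks like the ones mentioned can be a few lines of string inspection. A sketch under my own assumptions; the marker strings and the length threshold are illustrative guesses, not the original values:

```python
def passes_filters(headline, article, min_words=150):
    # Reject output where the model speaks for the author in the
    # headline (chat-style role prefixes leaking through), or where
    # the article is too short to pass as real. Markers and the
    # word-count threshold are illustrative, not the original values.
    bad_markers = ("USER:", "ASSISTANT:", "As an AI")
    if any(marker in headline for marker in bad_markers):
        return False
    return len(article.split()) >= min_words
```

Anything failing the check would simply be discarded and regenerated, which is cheap when generation itself takes under two minutes per article.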

I believe the articles that succeeded were fairly high quality. They had short sentences and a lot of paragraph breaks, but that is probably exactly what the kind of person reading AI-generated news (i.e. someone who fell for it) would be looking for. And while there are a few errors here and there, this isn't real news, so the errors aren't really a problem, and nobody reading it is likely to care. The articles also contained features I find rare or interesting, such as bullet-point lists and “===” separators.

There was one recurring problem, though, where the AI referred to itself in the article: because the model was called “Wizard”, it would repeatedly talk about “Wizard” doing or saying good things.

What is to be done:

After seeing how easy this is, how potent it is, and its capacity to heavily affect our society, I have come to the conclusion that there is nothing we can do to stop it. The cat is out of the bag, and there is no putting it back in. Anybody with basic beginner programming knowledge can download an open-source language model, something to run that model with, and write some code to put a website around it, all on consumer hardware or a very low cloud budget. For this reason, I believe there are two possibilities. The first, the optimistic one, is that we create anti-disinformation tools that are adept at catching fake news and warning people as it spreads. In this best-case scenario, fighting fake news becomes a field similar to cybersecurity, with a constant arms race between threat actors and the people trying to stop disinformation. The other possibility, the one I believe is much, much more likely, is that we are simply screwed. Distinguishing what is correct and true from what is lying and fake is already difficult today, and I believe that AI, and the ability to create these high-fidelity fake-news generators, will only exacerbate the problem, to the point where little found online can be trusted whatsoever.

You can get all the code, along with some saved articles/headlines, on Codeberg.