Thursday, June 8, 2023

A model for ChatGPT

How does ChatGPT work? Suppose we ignore for the moment that ChatGPT uses an artificial neural network, and represent its algorithm in the traditional way. This algorithm can be divided into two parts:

  1. Training: ChatGPT is provided with data (text files), which are used to build two data sets:
    1. A list of all the words that appear in any of the texts, without repetition, regardless of their order or the number of times each one appears.
    2. An array of indices into the word list, recording the number of times a given word occurs after a series of words. For example, if the series time travel appears in the texts, the indices of the words time and travel will appear in the array, followed by the index of the next word, followed by the number of times this sequence of three words appears in the set of texts used for training.

The construction of these two data sets (i.e., the training) is the most expensive and time-consuming part of the process.
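To make this concrete, the training step could be sketched in Python roughly as follows (my actual program was written in APL, as explained below; the whitespace tokenization and the function name train are simplifications of my own):

from collections import Counter

def train(text):
    """Build the two data sets described above: a vocabulary
    (each distinct word once) and a table counting how often each
    word follows every pair of consecutive words."""
    words = text.split()                      # naive whitespace tokenization
    vocabulary = sorted(set(words))           # data set 1: the word list, without repetition
    index = {w: i for i, w in enumerate(vocabulary)}
    # Data set 2: (index1, index2, index of next word) -> number of occurrences
    counts = Counter()
    for a, b, c in zip(words, words[1:], words[2:]):
        counts[(index[a], index[b], index[c])] += 1
    return vocabulary, index, counts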

  2. Query: A question is accepted, and the answer is built up by adding words, one at a time, as follows:
    1. All or part of the words in the question are looked up in the index array, which yields the possible next words. In general, we get several possibilities, along with the number of times each appears. This is an example:

word1 6 times

word2 4 times

word3 2 times

word4 1 time

word5 1 time

    2. The total number of times all these words appear is added up. In the example, we get 14.
    3. We get a random number between 1 and 14. Let’s say it’s 7.
    4. As 7 is greater than 6 (the number of times the first word occurs), but less than or equal to 10 (the combined number of times the first two words occur), word2 is selected. This will be the first word of the answer.
    5. Word2 is appended to the string of words in the question and the process is repeated, adding a new word at each step. When the process stops, the succession of all the words that have been added makes up the response. A sketch of this query step in code is shown below.
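This selection step, and the loop that repeats it, could be sketched in Python roughly like this (a simplification: the last two words of the context are used as the lookup key, the names next_word and answer are mine, and random.choices performs the weighted draw that steps 2 to 4 describe by hand):

import random

def next_word(context, vocabulary, index, counts):
    """Choose the next word given the last two words of the context,
    weighting each candidate by how often it followed that pair."""
    a, b = (index[w] for w in context[-2:])   # assumes these words occur in the training text
    candidates = [(k, n) for (i, j, k), n in counts.items() if (i, j) == (a, b)]
    if not candidates:
        return None                           # no known continuation: stop
    options, weights = zip(*candidates)
    return vocabulary[random.choices(options, weights=weights, k=1)[0]]

def answer(question, vocabulary, index, counts, max_words=20):
    """Build the response by appending one word at a time (step 5)."""
    context = question.split()
    response = []
    for _ in range(max_words):
        w = next_word(context, vocabulary, index, counts)
        if w is None:
            break
        response.append(w)
        context.append(w)
    return " ".join(response)

With the training sketch above, the query would be something like answer("time travel", *train(text)).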

Simple, right? So I had the idea of building a toy model of ChatGPT, using a traditional algorithm instead of an artificial neural network, to see whether comparable results could be obtained, despite the model's small size. To train it, I used the text of one of my novels, A face in time, which has about 33,000 words. The number of different words is 6,211. Of course, this is not comparable to the texts used to train ChatGPT: billions of files taken from the Internet.

Next, I programmed the query algorithm, as explained above. The program, written in the APL language, consisted of 18 instructions. Yes, just 18 instructions! This is because the algorithm is fairly simple and APL is a very powerful language: we can find all occurrences of a series of three indices (B) in an ARRAY of 33,000 rows by writing this:

ARRAY[;1 2 3]∧.=B
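For readers unfamiliar with APL, a rough NumPy equivalent of that inner product would be this (the numbers below are made up just for illustration):

import numpy as np

# One row per three-word sequence found in the text: columns 1-3 hold
# the word indices, column 4 the number of occurrences (made-up data).
ARRAY = np.array([[12, 47,  3, 6],
                  [12, 47,  9, 4],
                  [ 5, 12, 47, 2]])
B = np.array([12, 47, 3])

# Equivalent of ARRAY[;1 2 3]∧.=B: a Boolean vector marking the rows
# whose first three columns equal the series of indices B.
matches = (ARRAY[:, :3] == B).all(axis=1)
print(matches)    # [ True False False]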

As a first query, I proposed this:

time travel

And surprisingly I got this response:

is impossible!

I repeated the query and got this response:

is impossible as Lavalle said

I repeated the query and got this response:

is one of their favorite topics

Not bad, right? Awesome, even? Now I suppose someone will say that my program shows signs of being conscious, as they are now saying about ChatGPT.

How are my program and ChatGPT different, apart from the huge volume of data ChatGPT has been trained on? They are almost the same, for an artificial neural network is just a way of implementing an algorithm that does not guarantee correct execution every time, but in return gives quite acceptable results in a very short time. But ChatGPT has just the same consciousness as my little model, i.e., none.


Thematic Thread about Natural and Artificial Intelligence
Manuel Alfonseca
