How does ChatGPT work? Suppose we ignore for the moment that ChatGPT uses an artificial neural network, and represent its algorithm in the traditional way. This algorithm can be divided into two parts:
- Training: ChatGPT is provided with data (text files), which are used to build two data sets:
  - A list of all the words that appear in any of the texts, without repetition, regardless of their order or the number of times each one appears.
  - An array of indices into the word list, recording the number of times a given word occurs after a series of words. For example, if the series "time travel" appears in the texts, the indices of the words "time" and "travel" will appear in the array, followed by the index of the next word, followed by the number of times that this sequence of three words appears in the set of texts used for training.
The construction of these two data sets (i.e., the training) is the most expensive and time-consuming part of the process.
- Query: A question is accepted, and the answer is built by adding words as follows:
  - All or some of the words in the question are looked up in the index array, and the next word is found there. In general, we’ll get several possibilities, along with the number of times they appear. This is an example:
    word1: 6 times
    word2: 4 times
    word3: 2 times
    word4: 1 time
    word5: 1 time
  - The total number of times all these words appear is added up. In the example, we get 14.
  - We get a random number between 1 and 14. Let’s say it’s 7.
  - As 7 is greater than 6 (the number of times the first word occurs), but less than or equal to 10 (the number of times the first two words occur), word2 is selected. This will be the first word of the answer.
  - word2 is added to the end of the string of words in the question and the process is repeated. At each step, a new word is added. When the process stops, the succession of all the words that have been added makes up the response.
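The two parts described above can be sketched in a few lines of Python. This is only an illustration of the idea, not my APL program and certainly not how ChatGPT is implemented. The names `train`, `pick` and `generate` are made up for this sketch, and the two-word context is an assumption taken from the "time travel" example.

```python
import random
from collections import defaultdict


def train(text, context=2):
    """Build the word list and the table of follow-up counts."""
    words = text.split()
    vocab = sorted(set(words))                    # every word once, no repetition
    index = {w: i for i, w in enumerate(vocab)}   # word -> index in the word list
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(words) - context):
        key = tuple(index[w] for w in words[i:i + context])
        nxt = index[words[i + context]]
        counts[key][nxt] += 1                     # times nxt follows the series
    return vocab, index, counts


def pick(candidates, r):
    """Select from (item, count) pairs, given a number r between 1 and the total."""
    running = 0
    for item, count in candidates:
        running += count                          # cumulative count so far
        if r <= running:
            return item
    raise ValueError("r is larger than the total count")


def generate(prompt, vocab, index, counts, context=2, max_words=20, rng=random):
    """Extend the prompt word by word; every word of the prompt must be known."""
    state = [index[w] for w in prompt.split()]
    answer = []
    for _ in range(max_words):                    # arbitrary stopping rule (assumption)
        followers = counts.get(tuple(state[-context:]))
        if not followers:
            break                                 # series never seen in training
        candidates = sorted(followers.items(), key=lambda kv: -kv[1])
        total = sum(count for _, count in candidates)
        nxt = pick(candidates, rng.randint(1, total))
        state.append(nxt)
        answer.append(vocab[nxt])
    return " ".join(answer)


# The worked example from the text: total 14, random number 7.
table = [("word1", 6), ("word2", 4), ("word3", 2), ("word4", 1), ("word5", 1)]
print(pick(table, 7))  # word2
```

With a real text, one would call `train` once on the whole text and then call `generate("time travel", vocab, index, counts)` as many times as desired. Each call may give a different answer, because the next word is drawn at random in proportion to its count.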
Simple, right? So I had the idea to build a toy model of ChatGPT, using a traditional algorithm instead of an artificial neural network, to see whether comparable results could be obtained despite the small size of the model. To train it, I used the text of one of my novels, A face in time, which has about 33,000 words. The number of different words is 6211. Of course, this is not comparable to the texts used to train ChatGPT: billions of files taken from the Internet.
Next, I programmed the query algorithm, as explained above. The program, written in the APL language, consisted of 18 instructions. Yes, just 18 instructions! This is because the algorithm is fairly simple, and APL is a very powerful language: we can find all occurrences of a series of three indices (B) in an ARRAY of 33,000 rows by writing this:
ARRAY[;1 2 3]∧.=B
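For readers who do not know APL, here is roughly what that expression computes, written in Python. It is a sketch under one assumption: each row of the array holds three word indices followed by a count, as described in the training step. The function name `match_rows` and the sample numbers are invented for the illustration.

```python
def match_rows(array, b):
    """Return one boolean per row: True where the first three columns equal b."""
    return [list(row[:3]) == list(b) for row in array]


# Hypothetical rows: three word indices, then the number of occurrences.
array = [
    [4, 9, 2, 6],
    [4, 9, 7, 4],
    [1, 4, 9, 2],
]
print(match_rows(array, [4, 9, 7]))  # [False, True, False]
```

The APL version does the same comparison for all rows at once, without an explicit loop, which is one reason the whole program stays so short.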
As a first query, I proposed this:
time travel
And surprisingly I got this response:
is impossible!
I repeated the query and got this response:
is impossible as Lavalle said
I repeated the query and got this response:
is one of their favorite topics
Not bad, right? Awesome? Now I suppose someone will say that my program shows signs of being conscious, as they are now saying about ChatGPT.
How are my program and ChatGPT different, apart from the huge volume of data ChatGPT has been trained on? They are almost the same, for an artificial neural network is just a way of implementing an algorithm that does not guarantee that the algorithm will always execute correctly, but in return gives quite acceptable results in a very short time. But ChatGPT has just the same consciousness as my little model, i.e., none.