GPT models, also known as Generative Pretrained Transformer models, have recently become a hot topic, largely thanks to ChatGPT, which was introduced at the end of last year. But how do these models actually work, and what are their possibilities and limitations?
GPT-4 from OpenAI is currently the most powerful generative language model developed to date. It is trained in two steps. In the initial stage, the model is fed large amounts of text from the internet, which gives it a good understanding of words and language structures. However, after this first step, the model has no understanding of what it means to, for example, answer questions. The next step, fine-tuning, trains the model on specific data to adapt it to different tasks, in this case to a chatbot that can answer questions.
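To make the first step concrete, here is a minimal sketch of the pretraining objective, next-token prediction, assuming PyTorch and toy dimensions chosen purely for illustration (a real GPT model uses a deep transformer stack, not a single linear layer):

```python
import torch
import torch.nn as nn

vocab_size, d_model = 50_000, 768          # illustrative toy dimensions
embed = nn.Embedding(vocab_size, d_model)
lm_head = nn.Linear(d_model, vocab_size)   # stand-in for a full transformer stack

tokens = torch.randint(0, vocab_size, (1, 16))   # a batch of token ids from some text
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # the target is simply each next token

logits = lm_head(embed(inputs))                  # one predicted distribution per position
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()  # repeated over enormous text corpora, this loop is "pretraining"
```

Fine-tuning reuses the same objective but swaps the internet text for curated examples of the desired behavior, such as question-answer pairs.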
Generative language models like GPT-4 have a wide range of uses, and they can be very powerful tools in different contexts. They can, for example, power advanced customer support, an assistant for employees, or a chatbot that gives advice on complex topics such as medicine or law. Language models can also be used as versatile tools for language translation, proofreading, analyzing complex data, or as a source of inspiration.
During the summer, we, three students from Chalmers University of Technology, developed a chatbot product for Zenseact using GPT-3.5, the predecessor to GPT-4. When a user asks the chatbot a question, the most relevant sources are retrieved from the company's internal documentation using both embedding technology and a filtering equation of our own. This relevant data, along with some basic information and the question itself, is given to GPT-3.5, which generates an answer.
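The retrieval step can be illustrated with a simplified sketch. This is not the actual Zenseact code: the `embed` function is a hypothetical placeholder for any sentence-embedding model, the example documents are invented, and cosine similarity stands in for the full filtering equation, which is not shown here.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: in practice, call a real embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)  # toy 384-dimensional vector

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

documents = [
    "How to request access to the internal wiki.",
    "Safety procedures for test vehicles.",
    "Guidelines for expense reports.",
]

def retrieve(question: str, top_k: int = 2) -> list[str]:
    # Rank all documents by how similar their embeddings are to the question.
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine_similarity(q, embed(d)), reverse=True)
    return ranked[:top_k]

# The retrieved sources plus the question form the prompt sent to the model.
question = "Who do I ask about vehicle safety?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only these sources:\n{context}\n\nQuestion: {question}"
```

Grounding the model in retrieved documents like this lets it answer from internal documentation it was never trained on, and it keeps the prompt short enough to fit within the model's context window.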
But despite these impressive possibilities, GPT models come with some limitations. First, even though they can generate text that sounds human, the models do not understand the real meaning behind the words they produce. They lack the ability to understand the context in which they write, which can lead to misunderstandings and incorrect information.
Secondly, since GPT models are trained on very large amounts of data, there is a risk that they reproduce prejudices and stereotypes present in that training data, which can lead to them generating text that is biased or offensive. Another challenge is that these models require enormous amounts of data and computational power to train, which makes them difficult and expensive to develop and run on one's own.
It is important that we continue to develop and improve these models, while being aware of their limitations and potential risks. In the end, as with all technology, the responsibility for how GPT models are used lies with us.