GPT-4 Release Imminent: Multimodal and Disruptive Force in AI
Microsoft Germany’s CTO Andreas Braun announced the upcoming release of GPT-4 at an AI kickoff event on March 9, 2023. The event, titled “AI in Focus – Digital Kickoff,” highlighted the disruptive impact of Large Language Models (LLMs) such as the GPT series, as well as Microsoft’s Azure-OpenAI offering. It was conducted in German, with news outlet Heise in attendance. Braun mentioned GPT-4 almost in passing and highlighted its multimodal capability, citing video as one of the possibilities. He called LLMs game-changers because they teach machines to understand natural language, which previously only humans could read and understand. Multimodality, he said, would make the models more comprehensive.
Braun was joined by Marianne Janik, CEO of Microsoft Germany, who stressed the value-creation potential of AI in companies. Janik likened current AI development, including ChatGPT, to an “iPhone moment”: the point, she said, is not necessarily to replace jobs but to do repetitive tasks in a different way. She also talked about new professions emerging from AI’s possibilities, and recommended that companies form internal “competence centres” to train employees in AI use and to develop project ideas.
Clemens Siebler and Holger Kenn, Senior AI Specialist and Chief Technologist Business Development AI & Emerging Technologies at Microsoft Germany, respectively, discussed practical AI use cases and their teams’ current projects. Kenn explained how multimodal AI can translate text into images, music, and video. In addition to the GPT-3.5 model class, he discussed embeddings, which the model uses to represent text internally. Siebler shared a use case in which telephone calls are recorded, transcribed with speech-to-text, and summarised using AI, saving a large customer in the Netherlands 500 working hours a day.
However, Siebler emphasised that the AI would not always answer correctly and that validation was therefore necessary. The question of operational reliability and factual accuracy also came up during the Q&A, but no concrete answers were provided.
GPT-4 and Multimodality
Braun’s announcement of GPT-4 highlighted its multimodal capability. Multimodality, the use of multiple modes to convey information, has significant implications for AI, particularly for natural language processing. Braun suggested that multimodality would make GPT-4 more comprehensive and open up a wide range of possibilities. The one example he gave was video, implying that GPT-4 could handle video as one of its modes.
Disruption and Emerging Professions
Janik emphasised that current AI development marks a turning point, one that does not necessarily replace jobs but changes how repetitive tasks are done. She pointed to new professions emerging from AI’s possibilities and recommended that companies create internal “competence centres” to train employees and develop project ideas. These centres would also help with the migration of “old darlings” to more enriching and value-adding AI uses.
Practical Use Cases
Siebler and Kenn’s presentations highlighted practical use cases and their teams’ current projects. Kenn discussed multimodal AI, embeddings, and the GPT-3.5 model class. Siebler’s use case involved recording telephone calls, transcribing them with speech-to-text, and summarising them using AI, saving a large customer in the Netherlands 500 working hours a day. However, Siebler emphasised the need for validation, as the AI would not always provide correct answers.
Conclusion
The imminent release of GPT-4 marks another significant step towards more comprehensive AI capabilities. The emphasis on multimodality, as highlighted by Microsoft Germany's CTO Andreas Braun, suggests that GPT-4 will be a more versatile tool. Marianne Janik, CEO of Microsoft Germany, stressed that new professions are emerging from AI's possibilities and recommended that companies create internal "competence centres" to train employees and develop project ideas. The practical use cases presented by Clemens Siebler and Holger Kenn demonstrated the potential of AI, but Siebler also emphasised the need for validation, and questions about factual accuracy remained unanswered. As the release of GPT-4 draws near, its disruptive force and potential for transformation are increasingly apparent.