{"id":17047,"date":"2024-05-20T12:19:59","date_gmt":"2024-05-20T12:19:59","guid":{"rendered":"https:\/\/metaverseplanet.net\/blog\/?p=17047"},"modified":"2026-01-21T11:36:01","modified_gmt":"2026-01-21T11:36:01","slug":"introducing-gpt-4o","status":"publish","type":"post","link":"https:\/\/metaverseplanet.net\/blog\/introducing-gpt-4o\/","title":{"rendered":"Introducing GPT-4o: The Free Language Model with Unbelievable Abilities"},"content":{"rendered":"\n<p>OpenAI held its long-awaited event today, unveiling its new flagship model, GPT-4o. This model can speak, see, and hear like a real person.<\/p>\n\n\n\n<p>As a pioneer in the artificial intelligence revolution, OpenAI has consistently amazed the world with its innovative models. During today&#8217;s event, the company made several significant announcements, including updates to the ChatGPT chatbot and the GPT-4 language model.<\/p>\n\n\n\n<p>The highlight of the event was the introduction of GPT-4o, the company&#8217;s new flagship language model. Not only does it outperform the current GPT model, but it is also significantly faster.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">GPT-4o; Can reason through voice, text and image<\/h2>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-1-1024x576.jpeg\" alt=\"Introducing GPT-4o: The Free Language Model with Unbelievable Abilities\" class=\"wp-image-17065\" srcset=\"https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-1-1024x576.jpeg 1024w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-1-300x169.jpeg 300w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-1-768x432.jpeg 768w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-1-1536x864.jpeg 1536w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-1-390x220.jpeg 390w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-1-150x84.jpeg 150w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-1-scaled.jpeg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>The new GPT-4o model that the company will offer to its users will power the ChatGPT chatbot. Described as much more efficient and advanced than previous versions of GPT, this model can reason through voice, text, and images. According to the statements, GPT-4o is a built-in multi-modal artificial intelligence model, meaning it can understand audio, text, and images and produce content accordingly.<\/p>\n\n\n\n<p>There is a significant improvement in the voice response capabilities. Users can now have real-time conversations with the model, which feels much more realistic with reduced lag. According to OpenAI, GPT-4o can respond to sound in as little as 232 milliseconds, almost as fast as talking to a human. Previously, delays in voice mode averaged 2.8 seconds.<\/p>\n\n\n\n<p>Additionally, you can even interrupt ChatGPT and ask it to change its response while it is replying. For example, during the live demo at the event, OpenAI executives asked the model to tell a story about a robot. As the model was speaking, they interrupted and requested it to tell the story with different emotions. ChatGPT instantly made the change, fulfilling their request. You can check out these moments in the video above.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-2-1024x576.jpeg\" alt=\"Introducing GPT-4o: The Free Language Model with Unbelievable Abilities\" class=\"wp-image-17066\" srcset=\"https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-2-1024x576.jpeg 1024w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-2-300x169.jpeg 300w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-2-768x432.jpeg 768w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-2-390x220.jpeg 390w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-2-150x84.jpeg 150w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-2.jpeg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>The model&#8217;s built-in advanced visual capabilities were also demonstrated. The model can &#8220;see&#8221; and comment on what is shown to it through the device&#8217;s camera. For example, in a demo, an equation written on paper was shown to the model, and it was asked to help solve it. ChatGPT assisted in finding the solution. When &#8220;I Love You ChatGPT&#8221; was written on the paper, it responded with an emotional voice, just like a human.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">It is able to translate in real-time surprisingly well<\/h2>\n\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"500\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">OpenAI has just demonstrated its new GPT-4o model doing real-time translations \ud83e\udd2f <a href=\"https:\/\/t.co\/Cl0gp9v3kN\">pic.twitter.com\/Cl0gp9v3kN<\/a><\/p>&mdash; Tom Warren (@tomwarren) <a href=\"https:\/\/twitter.com\/tomwarren\/status\/1790074556981403997?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">May 13, 2024<\/a><\/blockquote><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script>\n<\/div><\/figure>\n\n\n\n<p>*<em>If you can&#8217;t view the video&nbsp;<strong><a href=\"https:\/\/x.com\/tomwarren\/status\/1790074556981403997?utm_source=metaverseplanett\" rel=\"noreferrer noopener\" target=\"_blank\">link here<\/a>&nbsp;<\/strong>Click.<\/em><\/p>\n\n\n\n<p>Another demo at the event showcased the translation capabilities of the new model. OpenAI demonstrated how GPT-4o can translate in real-time. Mira Murati spoke in Italian, while other OpenAI employees spoke in English. The model quickly translated sentences and conveyed them to the other side with almost no delay.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">It can read your screen through the desktop application and help with coding:<\/h2>\n\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"500\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">With the GPT-4o\/ChatGPT desktop app, you can have a coding buddy (black circle) that talks to you and sees what you see!<a href=\"https:\/\/twitter.com\/hashtag\/openai?src=hash&amp;ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">#openai<\/a> announcements thread! <a href=\"https:\/\/t.co\/CpvCkjI0iA\">https:\/\/t.co\/CpvCkjI0iA<\/a> <a href=\"https:\/\/t.co\/Tfh81mBHCv\">pic.twitter.com\/Tfh81mBHCv<\/a><\/p>&mdash; andrew gao (@itsandrewgao) <a href=\"https:\/\/twitter.com\/itsandrewgao\/status\/1790069934157910274?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">May 13, 2024<\/a><\/blockquote><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script>\n<\/div><\/figure>\n\n\n\n<p>*<em>If you can&#8217;t view the video&nbsp;<strong><a href=\"https:\/\/x.com\/itsandrewgao\/status\/1790069934157910274?utm_source=metaverseplanet\" rel=\"noreferrer noopener\" target=\"_blank\">link here<\/a>&nbsp;<\/strong>Click.<\/em><\/p>\n\n\n\n<p>In another live demo of GPT-4o, some of its coding capabilities were demonstrated. Through ChatGPT&#8217;s new desktop application, the model was able to understand and analyze the code that was written. It also provided explanations of the code it analyzed.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">GPT-4o is able to look at you through the camera and make comments:<\/h2>\n\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"500\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: <a href=\"https:\/\/t.co\/MYHZB79UqN\">https:\/\/t.co\/MYHZB79UqN<\/a><br><br>Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. <a href=\"https:\/\/t.co\/uuthKZyzYx\">pic.twitter.com\/uuthKZyzYx<\/a><\/p>&mdash; OpenAI (@OpenAI) <a href=\"https:\/\/twitter.com\/OpenAI\/status\/1790072174117613963?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">May 13, 2024<\/a><\/blockquote><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script>\n<\/div><\/figure>\n\n\n\n<p>*<em>If you can&#8217;t view the video&nbsp;<strong><a href=\"https:\/\/x.com\/OpenAI\/status\/1790072174117613963?utm_source=metaverseplanet\" rel=\"noreferrer noopener\" target=\"_blank\">link here<\/a>&nbsp;<\/strong>Click.<\/em><\/p>\n\n\n\n<p>OpenAI shared videos of GPT-4o&#8217;s capabilities after the event ended. For example, in one video, we can see a conversation between the model and a human through the front camera. ChatGPT is able to understand how the person looks through the camera and make comments about their appearance. It even comments on the environment in which the person is located. It would not be wrong to say that it has become difficult to distinguish the model from a human.<\/p>\n\n\n\n<p>In another demo, we can see that the model can understand facial expressions and emotional states by looking at a user&#8217;s face. For example, it says, &#8220;You look so happy and cheerful with a smile on your face and a little excitement.&#8221;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">He even makes sarcastic jokes:<\/h2>\n\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"500\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">OpenAI&#39;s new GPT-4o model being sarcastic \ud83d\ude05<a href=\"https:\/\/t.co\/ndItPzKhrN\">https:\/\/t.co\/ndItPzKhrN<\/a> <a href=\"https:\/\/t.co\/uywEPGZ9CO\">pic.twitter.com\/uywEPGZ9CO<\/a><\/p>&mdash; Tom Warren (@tomwarren) <a href=\"https:\/\/twitter.com\/tomwarren\/status\/1790079179892858998?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">May 13, 2024<\/a><\/blockquote><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script>\n<\/div><\/figure>\n\n\n\n<p>*<em>If you can&#8217;t view the video&nbsp;<strong><a href=\"https:\/\/x.com\/tomwarren\/status\/1790079179892858998?utm_source=metaverseplanet\" rel=\"noreferrer noopener\" target=\"_blank\">link here<\/a>&nbsp;<\/strong>Click.<\/em><\/p>\n\n\n\n<p>In another example, the model is asked to make sarcastic jokes. GPT-4o can indeed speak sarcastically and make jokes.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Here&#8217;s an example of you being able to interrupt and get what you want:<\/h2>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe title=\"Fast counting with GPT-4o\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/G8sm27sGUu4?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p>In this video, the model is asked to count to 10. The OpenAI employee interrupts it after it starts counting and asks it to count faster. We can hear that it successfully fulfills all requests, even saying &#8220;OK&#8221; as if it is tired of some of them.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Two GPT-4os chatting, singing together<\/h2>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe title=\"Two GPT-4os interacting and singing\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/MirzFk_DSiI?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p>In a video, we can see GPT-4o chatting with another GPT-4o. They even sing a duet towards the end of the video.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">This is how the model reacts when she sees a dog:<\/h2>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe title=\"Dog meets GPT-4o\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/HU_4vMu9xFI?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">It can be the &#8220;eye&#8221; of the visually impaired<\/h2>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe title=\"Be My Eyes Accessibility with GPT-4o\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/KwNUJ69RbwY?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p>In another example, we see a potentially useful use case for the model&#8217;s visual capabilities. Here, the model acts as the eyes of a visually impaired individual by describing their surroundings to them.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">GPT-4o will also be available to free users!<\/h2>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-3-1024x576.jpeg\" alt=\"\" class=\"wp-image-17067\" srcset=\"https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-3-1024x576.jpeg 1024w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-3-300x169.jpeg 300w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-3-768x432.jpeg 768w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-3-390x220.jpeg 390w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-3-150x84.jpeg 150w, https:\/\/metaverseplanet.net\/blog\/wp-content\/uploads\/2024\/05\/Introducing-GPT-4o-The-Free-Language-Model-with-Unbelievable-Abilities-3.jpeg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>OpenAI also shared exciting news about the language model. According to the company, GPT-4o can be used through the free version of ChatGPT without any cost. However, there will be a message limit, and once this limit is exceeded, it will automatically revert to GPT-3.5. The company stated that GPT-4o&#8217;s text and image capabilities are being rolled out starting today. The new voice mode will be available to Plus users in the future.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"javascript:void(0)\"><\/a>You may also like this content<\/h3>\n\n\n<ul class=\"wp-block-latest-posts__list wp-block-latest-posts\"><\/ul>","protected":false},"excerpt":{"rendered":"<p>OpenAI held its long-awaited event today, unveiling its new flagship model, GPT-4o. This model can speak, see, and hear like a real person. As a pioneer in the artificial intelligence revolution, OpenAI has consistently amazed the world with its innovative models. During today&#8217;s event, the company made several significant announcements, including updates to the ChatGPT &hellip;<\/p>\n","protected":false},"author":1,"featured_media":13280,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"googlesitekit_rrm_CAown96uCw:productID":"","footnotes":""},"categories":[332],"tags":[333,65],"class_list":["post-17047","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-information","tag-ai-blog","tag-chatgpt-news-and-content"],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/metaverseplanet.net\/blog\/wp-json\/wp\/v2\/posts\/17047","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/metaverseplanet.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/metaverseplanet.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/metaverseplanet.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/metaverseplanet.net\/blog\/wp-json\/wp\/v2\/comments?post=17047"}],"version-history":[{"count":0,"href":"https:\/\/metaverseplanet.net\/blog\/wp-json\/wp\/v2\/posts\/17047\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/metaverseplanet.net\/blog\/wp-json\/wp\/v2\/media\/13280"}],"wp:attachment":[{"href":"https:\/\/metaverseplanet.net\/blog\/wp-json\/wp\/v2\/media?parent=17047"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/metaverseplanet.net\/blog\/wp-json\/wp\/v2\/categories?post=17047"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/metaverseplanet.net\/blog\/wp-json\/wp\/v2\/tags?post=17047"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}