{"id":1637,"date":"2024-05-23T11:38:48","date_gmt":"2024-05-23T11:38:48","guid":{"rendered":"https:\/\/ukti.co.in\/blog\/?p=1637"},"modified":"2024-05-23T11:38:50","modified_gmt":"2024-05-23T11:38:50","slug":"leaked-the-shocking-truth-about-gpt-4os-abilities","status":"publish","type":"post","link":"https:\/\/ukti.co.in\/blog\/2024\/05\/23\/leaked-the-shocking-truth-about-gpt-4os-abilities\/","title":{"rendered":"Leaked: The Shocking Truth About GPT-4o&#8217;s Abilities"},"content":{"rendered":"\n<p class=\"has-medium-font-size\"><strong>Just as we thought the Generative AI hype was starting to settle, OpenAI stirred the waters on May 13, 2024, when it announced a new flagship model for ChatGPT: the GPT-4o (\u201co\u201d for \u201comni\u201d). But did you hear the shocking truth about it?<\/strong>&nbsp;<\/p>\n\n\n\n<p>No, we aren\u2019t talking about its new voice model, which closely resembled Scarlett Johansson\u2019s voice. The voice was rolled back after the actor publicly made her discontent very clear.<\/p>\n\n\n\n<p>We\u2019re referring to a host of new features that GPT-4o brings, including real-time reasoning across audio, vision, and text. OpenAI claims GPT-4o is faster than GPT-4 and significantly better at translation and coding, boasting a more human-like interaction style than its predecessor.<\/p>\n\n\n\n<p>But is it truly as impressive as they make it out to be? Is it the first step towards the multi-modal AI technology we&#8217;ve long been promised? Read on as we put OpenAI&#8217;s claims to the test and uncover the truth about GPT-4o&#8217;s new features and functionalities.<\/p>\n\n\n\n<div class=\"wp-block-aioseo-table-of-contents\"><ul><li><a href=\"#aioseo-overview-of-gpt-4o\">Overview of GPT-4o<\/a><\/li><li><a href=\"#aioseo-how-to-access-gpt-4o\">How to Access GPT-4o?<\/a><\/li><li><a href=\"#aioseo-key-features-of-gpt-4o\">Key Features of GPT-4o<\/a><ul><li><a href=\"#aioseo-1-multimodal-capabilities\">1. 
Multimodal Capabilities<\/a><\/li><li><a href=\"#aioseo-2-real-time-interaction\">2. Real-Time Interaction<\/a><\/li><li><a href=\"#aioseo-3-enhanced-vision-abilities\">3. Enhanced Vision Abilities<\/a><\/li><li><a href=\"#aioseo-4-multilingual-support\">4. Multilingual Support<\/a><\/li><\/ul><\/li><li><a href=\"#aioseo-gpt-4o-vs-gpt-4-exploring-chatgpts-new-capabilities\">GPT-4o Vs. GPT-4: Exploring ChatGPT\u2019s New Capabilities<\/a><ul><li><a href=\"#aioseo-1-gpt-4o-vs-gpt-4-analyzing-text-input\">1. GPT-4o Vs. GPT-4: Analyzing Text Input<\/a><\/li><li><a href=\"#aioseo-2-gpt-4o-vs-gpt-4-analyzing-image-input\">2. GPT-4o Vs. GPT-4: Analyzing Image Input<\/a><\/li><li><a href=\"#aioseo-3-gpt-4o-vs-gpt-4-analyzing-video-input\">3. GPT-4o Vs. GPT-4: Analyzing Video Input<\/a><\/li><li><a href=\"#aioseo-4-gpt-4o-vs-gpt-4-translation\">4. GPT-4o Vs. GPT-4: Translation<\/a><\/li><li><a href=\"#aioseo-5-gpt-4o-vs-gpt-4-generating-images\">5. GPT-4o Vs. GPT-4: Generating Images<\/a><\/li><\/ul><\/li><li><a href=\"#aioseo-wrapping-up\">Wrapping Up<\/a><\/li><li><a href=\"#aioseo-about-the-author\">About the Author<\/a><\/li><\/ul><\/div>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"aioseo-overview-of-gpt-4o\"><strong>Overview of GPT-4o<\/strong><\/h2>\n\n\n\n<p>GPT-4o is OpenAI\u2019s newest large language model, developed to deliver a more natural human-computer interaction. The model accepts any combination of text, audio, image, and video as input and generates outputs in any combination of text, audio, and image.<\/p>\n\n\n\n<p>While GPT-4 could also understand audio and video inputs, GPT-4o is supposed to be better at the task. 
According to <a href=\"https:\/\/openai.com\/index\/hello-gpt-4o\/\" target=\"_blank\" rel=\"noopener\" title=\"\">OpenAI<\/a> \u2013<\/p>\n\n\n\n<p><em><strong>\u201cGPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API.\u201d<\/strong><\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"aioseo-how-to-access-gpt-4o\"><strong>How to Access GPT-4o?<\/strong><\/h2>\n\n\n\n<p>GPT-4o is available to all ChatGPT users. However, free users get only a limited number of prompts. If you\u2019re using the free ChatGPT version, you must upgrade to ChatGPT Plus to get a GPT-4o prompt limit up to 5 times higher. You can access GPT-4o via the ChatGPT web and mobile apps.<\/p>\n\n\n\n<p>Once you\u2019ve upgraded to ChatGPT Plus, GPT-4o will be set as your default language model. If it doesn\u2019t appear, you can manually set GPT-4o as your preferred model from the drop-down menu in the chat interface. If you want, you can also switch back to GPT-4 or GPT-3.5.<\/p>\n\n\n\n<p><\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/lh7-us.googleusercontent.com\/8trc5ER8xMnsYKBNCw4J6GyEbou8iSt5UMGjQjP7utZrXDLxny9a0pZNJVMgjd1COtE5vT8MdfQfOhTOlDBCVKIGw-woiuPJ_nEwuByYROeYcDTHJOVe-s0yg4M5LRW-Os-z_W1Gqu76nBQI_kUOmQ\" alt=\"\"\/><\/figure><\/div>\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"aioseo-key-features-of-gpt-4o\"><strong>Key Features of GPT-4o<\/strong><\/h2>\n\n\n\n<h5 class=\"wp-block-heading\" id=\"aioseo-1-multimodal-capabilities\">1. <strong>Multimodal Capabilities<\/strong><\/h5>\n\n\n\n<p>In GPT-4o, you can provide input in the form of text, audio, images, or video. 
Based on your input, the system will generate text, audio, or image output.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\" id=\"aioseo-2-real-time-interaction\">2. <strong>Real-Time Interaction<\/strong><\/h5>\n\n\n\n<p>With an average response time of 320 milliseconds, GPT-4o\u2019s responsiveness is comparable to human response time in conversations.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\" id=\"aioseo-3-enhanced-vision-abilities\">3. <strong>Enhanced Vision Abilities<\/strong><\/h5>\n\n\n\n<p>GPT-4o is much better at analyzing visual inputs, such as images and videos. This allows the model to produce more accurate textual and visual results from visual inputs.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\" id=\"aioseo-4-multilingual-support\"><strong>4. Multilingual Support<\/strong><\/h5>\n\n\n\n<p>GPT-4o supports 50+ languages and comes with significant advancements in text processing for non-English languages.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"aioseo-gpt-4o-vs-gpt-4-exploring-chatgpts-new-capabilities\"><strong>GPT-4o Vs. GPT-4: Exploring ChatGPT\u2019s New Capabilities<\/strong><\/h2>\n\n\n\n<p>Let\u2019s put GPT-4o to the test and see how it fares against its predecessor, GPT-4.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"aioseo-1-gpt-4o-vs-gpt-4-analyzing-text-input\"><strong>1. GPT-4o Vs. GPT-4: Analyzing Text Input<\/strong><\/h3>\n\n\n\n<p>To test the two GPT models&#8217; text-analyzing capacity, we entered a simple prompt instructing them to generate a poem. 
We wanted to assess the speed at which they analyzed the text and generated a result.<\/p>\n\n\n\n<p>Here\u2019s a side-by-side comparison of the output produced by GPT-4o and GPT-4:<\/p>\n\n\n\n<p><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-us.googleusercontent.com\/2-Uxa-K6fuCpJhBkPIAGphWO7kGeJufmu260OuWw1CT5gCQt13GipaJbET9wnH73C6vZvVUJWu0Ma1tgwtaoi4whVnqpppgyJYqtTPm5UQw6bL0R-05thH3IJccsYkx4Cf_kkWu2rO_IenhODz1NHg\" alt=\"Side-by-side comparison of the text output produced by GPT-4o and GPT-4 based on a text input.\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>GPT-4o clearly outpaces its predecessor, as it not only produced a longer poem from our prompt but also took less time than GPT-4. It also offered greater depth and detail in its depiction of the characters and settings.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"aioseo-2-gpt-4o-vs-gpt-4-analyzing-image-input\"><strong>2. GPT-4o Vs. GPT-4: Analyzing Image Input<\/strong><\/h3>\n\n\n\n<p>Next, we wanted to test how good GPT-4o is at analyzing images and generating output based on them. We also tested its speed and accuracy against the older GPT version. Here are the results:<\/p>\n\n\n\n<p><\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/lh7-us.googleusercontent.com\/XSd26Dwz1G018WyBYsiPZ_tLv5cwNfqZ6U3EDbnQ72qNiUqE78Rax-f2zDF1FuRXur7vj395Jf9NZGTv9i7fLakd4c9HwD_BgSHGnGAzUkD2shA0KFcSKwVHDsdl9ZyOuXnzWXa1y5pooGk1txcRCQ\" alt=\"Side-by-side comparison of the text output produced by GPT-4o and GPT-4 based on a prompt with an image attached\"\/><\/figure><\/div>\n\n\n<p><\/p>\n\n\n\n<p>Again, GPT-4o leaves its predecessor far behind in terms of speed. It analyzed the image and generated a text output based on the prompt accompanying the picture. 
However, the content of the two outputs was largely similar in word choice, phrasing, and overall quality.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"aioseo-3-gpt-4o-vs-gpt-4-analyzing-video-input\"><strong>3. GPT-4o Vs. GPT-4: Analyzing Video Input<\/strong><\/h3>\n\n\n\n<p>Until now, ChatGPT users only had the option to enter text, images, and documents as input because GPT-4 and other models didn\u2019t support video inputs. However, with GPT-4o, OpenAI has enabled this functionality.&nbsp;<\/p>\n\n\n\n<p>We can attach video files to our prompts, and ChatGPT will generate the requested output after analyzing the video. To test this feature, we asked ChatGPT to analyze a video of our GPT-4o vs. GPT-4 comparison from the previous section.&nbsp;&nbsp;<\/p>\n\n\n\n<p><\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"987\" src=\"https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/Screenshot-2024-05-22-163320-1024x987.png\" alt=\"Snapshot showing the GPT-4o's ability to analyze video files and produce an output based on it\" class=\"wp-image-1643\" srcset=\"https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/Screenshot-2024-05-22-163320-1024x987.png 1024w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/Screenshot-2024-05-22-163320-300x289.png 300w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/Screenshot-2024-05-22-163320-768x741.png 768w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/Screenshot-2024-05-22-163320.png 1065w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n\n\n<p><\/p>\n\n\n\n<p>Responding to our initial prompt, GPT-4o provided details like the video&#8217;s title, duration, and resolution. 
However, when given a follow-up prompt, it returned a detailed analysis of the introduction and responses from the two GPT models.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"aioseo-4-gpt-4o-vs-gpt-4-translation\"><strong>4. GPT-4o Vs. GPT-4: Translation<\/strong><\/h3>\n\n\n\n<p>OpenAI especially emphasized GPT-4o\u2019s performance with non-English text. So, let\u2019s test how good it is at translation. We asked ChatGPT to translate an excerpt from Mandarin to English using GPT-4 and GPT-4o. See the comparison below:<\/p>\n\n\n\n<p><\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"384\" src=\"https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-translation-1-1024x384.jpg\" alt=\"Side-by-side comparison of translation capabilities of GPT-4o and GPT-4\" class=\"wp-image-1642\" srcset=\"https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-translation-1-1024x384.jpg 1024w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-translation-1-300x113.jpg 300w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-translation-1-768x288.jpg 768w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-translation-1-1536x576.jpg 1536w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-translation-1.jpg 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n\n\n<p><\/p>\n\n\n\n<p>While the two translations are nearly identical, we noticed that GPT-4o was significantly faster, which has been the case in every comparison we have done so far. We also used Google Translate to verify the accuracy of these translations. 
The result was similar, with only minor differences.<\/p>\n\n\n\n<p><\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" loading=\"lazy\" width=\"771\" height=\"923\" src=\"https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/Screenshot-2024-05-22-165228.png\" alt=\"Snapshot of Google Translate\" class=\"wp-image-1644\" srcset=\"https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/Screenshot-2024-05-22-165228.png 771w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/Screenshot-2024-05-22-165228-251x300.png 251w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/Screenshot-2024-05-22-165228-768x919.png 768w\" sizes=\"(max-width: 771px) 100vw, 771px\" \/><\/figure><\/div>\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"aioseo-5-gpt-4o-vs-gpt-4-generating-images\"><strong>5. GPT-4o Vs. GPT-4: Generating Images<\/strong><\/h3>\n\n\n\n<p>Image generation with ChatGPT has been hit-and-miss, especially when the image has to include text. Users frequently face issues like misspelled words, uneven spacing, and blurred letters. So, is GPT-4o any better? Let\u2019s find out.<\/p>\n\n\n\n<p>We used the same prompt to generate an image using GPT-4o and GPT-4. 
See the comparison below:<\/p>\n\n\n\n<p><\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"1024\" height=\"384\" src=\"https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-image-generation-1024x384.jpg\" alt=\"Side-by-side comparison of image generation in GPT-4o and GPT-4\" class=\"wp-image-1640\" srcset=\"https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-image-generation-1024x384.jpg 1024w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-image-generation-300x113.jpg 300w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-image-generation-768x288.jpg 768w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-image-generation-1536x576.jpg 1536w, https:\/\/ukti.co.in\/blog\/wp-content\/uploads\/2024\/05\/GPT-image-generation.jpg 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n\n\n<p><\/p>\n\n\n\n<p>In our brief testing, we found only marginal differences in the images generated by the two GPT models. There was hardly any improvement in the quality or accuracy of text in images created using GPT-4o compared to results from GPT-4.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"aioseo-wrapping-up\"><strong>Wrapping Up<\/strong><\/h2>\n\n\n\n<p>GPT-4o is a welcome addition to ChatGPT, primarily because it\u2019s twice as fast and 50% cheaper in the API. It can analyze text, images, and videos in prompts and respond to queries almost in real time. While there\u2019s still room for improvement in the quality of the outputs generated, we definitely noticed how little time it takes to produce them.<\/p>\n\n\n\n<p>But where GPT-4o really shines is its ability to understand video and voice inputs. 
With human-like response times and better accuracy, ChatGPT has taken the lead over competitors and become the first true multi-modal gen AI platform.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"aioseo-about-the-author\"><strong>About the Author<\/strong><\/h2>\n\n\n<div class=\"wp-block-post-author-biography\">A content writer by the day and avid reader by anytime remaining, Akshada\u2019s genres range from philosophy to good old-fashioned classics. Anything that can add value to the human mind has the ability to grab her attention; the question often is whether it can hold it.<\/div>","protected":false},"excerpt":{"rendered":"<p>Just as we thought the Generative AI hype was starting to settle, OpenAI stirred the waters on May 13, 2024, when it announced a new flagship model for ChatGPT: the GPT-4o (\u201co\u201d for \u201comni\u201d). But did you hear the shocking truth about it?&nbsp; No, we aren\u2019t talking about its new voice model, which closely resembled &#8230; <a title=\"Leaked: The Shocking Truth About GPT-4o&#8217;s Abilities\" class=\"read-more\" href=\"https:\/\/ukti.co.in\/blog\/2024\/05\/23\/leaked-the-shocking-truth-about-gpt-4os-abilities\/\" aria-label=\"More on Leaked: The Shocking Truth About GPT-4o&#8217;s Abilities\">Read 
more<\/a><\/p>\n","protected":false},"author":10,"featured_media":1646,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[29],"tags":[65],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/ukti.co.in\/blog\/wp-json\/wp\/v2\/posts\/1637"}],"collection":[{"href":"https:\/\/ukti.co.in\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ukti.co.in\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ukti.co.in\/blog\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/ukti.co.in\/blog\/wp-json\/wp\/v2\/comments?post=1637"}],"version-history":[{"count":1,"href":"https:\/\/ukti.co.in\/blog\/wp-json\/wp\/v2\/posts\/1637\/revisions"}],"predecessor-version":[{"id":1647,"href":"https:\/\/ukti.co.in\/blog\/wp-json\/wp\/v2\/posts\/1637\/revisions\/1647"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ukti.co.in\/blog\/wp-json\/wp\/v2\/media\/1646"}],"wp:attachment":[{"href":"https:\/\/ukti.co.in\/blog\/wp-json\/wp\/v2\/media?parent=1637"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ukti.co.in\/blog\/wp-json\/wp\/v2\/categories?post=1637"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ukti.co.in\/blog\/wp-json\/wp\/v2\/tags?post=1637"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}