GPT-4o explained: Everything you need to know

OpenAI is one of the defining vendors of the generative AI era.

The foundation of OpenAI’s success and popularity is the company’s GPT family of large language models (LLMs), including GPT-3 and GPT-4, alongside the company’s ChatGPT conversational AI service.

OpenAI announced GPT-4 Omni (GPT-4o) as the company’s new flagship multimodal language model on May 13, 2024, during the company’s Spring Updates event. As part of the event, OpenAI released multiple videos demonstrating the intuitive voice response and output capabilities of the model.

In July 2024, OpenAI launched GPT-4o mini, a smaller version of GPT-4o and the company’s most advanced small model.

What is GPT-4o?

GPT-4o is the flagship model of the OpenAI LLM technology portfolio. The O stands for omni, and it isn’t just marketing hyperbole: it refers to the model’s multiple modalities for text, vision and audio.

The GPT-4o model marks a new evolution for the GPT-4 LLM that OpenAI first released in March 2023. This isn’t the first update for GPT-4 either; the model first got a boost in November 2023 with the debut of GPT-4 Turbo.

The GPT acronym stands for generative pre-trained transformer. A transformer model is a foundational element of generative AI, providing a neural network architecture that is able to understand and generate new outputs.

GPT-4o goes beyond what GPT-4 Turbo provided in terms of both capabilities and performance. As was the case with its GPT-4 predecessors, GPT-4o can be used for text generation use cases, such as summarization and knowledge-based question and answer. The model is also capable of reasoning, solving complex math problems and coding.

The GPT-4o model introduces a rapid audio input response that, according to OpenAI, is similar to a human’s, with an average response time of 320 milliseconds. The model can also respond with an AI-generated voice that sounds human.

Rather than having multiple separate models that understand audio, images (which OpenAI refers to as vision) and text, GPT-4o combines those modalities into a single model. As such, GPT-4o can understand any combination of text, image and audio input and respond with outputs in any of those forms. The promise of GPT-4o and its high-speed multimodal responsiveness is that it allows the model to engage in more natural and intuitive interactions with users.

GPT-4o mini is OpenAI’s fastest model and offers applications at a lower cost. It is smarter than GPT-3.5 Turbo, is 60% cheaper and has training data that runs through October 2023. GPT-4o mini is available to developers in text and vision variants through the Assistants API, Chat Completions API and Batch API. The mini version is also available in ChatGPT to Free, Plus and Team users.
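
For developers, the quickest way to try either model is a plain Chat Completions call. The minimal Python sketch below is illustrative rather than official documentation: it assumes the openai Python SDK (v1 or later) is installed and that an OPENAI_API_KEY environment variable is set. It targets gpt-4o-mini; swapping the model name for gpt-4o targets the flagship model.

```python
from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment by default.
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # or "gpt-4o" for the flagship model
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what makes an omni model different from a text-only LLM."},
    ],
)

print(response.choices[0].message.content)
```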

What can GPT-4o do?

At the time of its release, GPT-4o was the most capable of all OpenAI models in terms of both functionality and performance. The many things that GPT-4o can do include the following:

Real-time interactions. The GPT-4o model can engage in real-time verbal conversations without any noticeable delays.

Knowledge-based Q&A. As was the case with all prior GPT-4 models, GPT-4o has been trained with a knowledge base and is able to respond to questions.

Text summarization and generation. GPT-4o can execute common text LLM tasks, including text summarization and generation.

Multimodal reasoning and generation. GPT-4o integrates text, voice and vision into a single model, allowing it to process and respond to a combination of data types. The model can understand audio, images and text at the same speed. It can also generate responses via audio, images and text.

Language and audio processing. GPT-4o has advanced capabilities in handling more than 50 different languages.

Sentiment analysis. The model understands user sentiment across the different modalities of text, audio and video.

Voice nuance. GPT-4o can generate speech with emotional nuances, making it effective for applications requiring sensitive and nuanced communication.

Audio content analysis. The model can generate and understand spoken language, which can be applied in voice-activated systems, audio content analysis and interactive storytelling.

Real-time translation. The multimodal capabilities of GPT-4o can support real-time translation from one language to another.

Image understanding and vision. The model can analyze images and videos, allowing users to upload visual content that GPT-4o will understand, explain and analyze (a short API sketch follows this list).

Data analysis. The vision and reasoning capabilities enable users to analyze data contained in data charts. GPT-4o can also create data charts based on analysis or a prompt.

File uploads. Beyond the knowledge cutoff, GPT-4o supports file uploads, letting users bring in their own data for analysis.

Memory and contextual awareness. GPT-4o can remember previous interactions and maintain context over longer conversations.

Large context window. With a context window supporting up to 128,000 tokens, GPT-4o can maintain coherence over longer conversations or documents, making it suitable for detailed analysis.

Reduced hallucination and improved safety. The model is designed to minimize the generation of incorrect or misleading information, and it includes enhanced safety protocols to ensure outputs are appropriate and safe for users.
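
As a concrete illustration of the vision capability above, the hedged sketch below sends an image URL alongside a text question to gpt-4o through the Chat Completions API. The image URL and the prompt are placeholders, and the same openai SDK and API key assumptions from the earlier example apply.

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            # A single user message can mix text and image parts.
            "content": [
                {"type": "text", "text": "What trend does this chart show?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/sales-chart.png"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```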

How to use GPT-4o

There are several ways users and organizations can use GPT-4o:

ChatGPT Free. The GPT-4o model is set to be available to free users of OpenAI’s ChatGPT chatbot, where it will replace the current default model. ChatGPT Free users will have restricted message access and will not get access to some advanced features, including vision, file uploads and data analysis.

ChatGPT Plus. Users of OpenAI’s paid service for ChatGPT get full access to GPT-4o, without the feature restrictions that are in place for free users.

API access. Developers can access GPT-4o through OpenAI’s API, which allows integration into applications to make full use of GPT-4o’s capabilities.

Desktop applications. OpenAI has integrated GPT-4o into desktop applications, including a new app for Apple’s macOS that was also launched on May 13.

Custom GPTs. Organizations can create custom GPT versions of GPT-4o tailored to specific business needs or departments. The custom model can potentially be offered to users via OpenAI’s GPT Store.

Microsoft OpenAI Service. Users can explore GPT-4o’s capabilities in a preview mode within the Microsoft Azure OpenAI Studio, specifically designed to handle multimodal inputs including text and vision. This initial release lets Azure OpenAI Service customers test GPT-4o’s functionalities in a controlled environment, with plans to expand its capabilities in the future (see the sketch after this list).
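
For the Azure route above, requests go to your own Azure OpenAI resource rather than OpenAI’s endpoint. The sketch below uses the AzureOpenAI client from the same openai Python SDK; the endpoint, API version and deployment name are placeholders to replace with the values from your own Azure OpenAI Studio deployment.

```python
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",  # assumption: use the version your resource supports
    azure_endpoint="https://YOUR-RESOURCE-NAME.openai.azure.com",
)

response = client.chat.completions.create(
    # In Azure, this is your deployment name, not the raw model ID.
    model="my-gpt-4o-deployment",
    messages=[{"role": "user", "content": "Hello from an Azure OpenAI deployment of GPT-4o."}],
)

print(response.choices[0].message.content)
```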

OpenAI unleashed an artificial intelligence (AI) revolution when the company launched ChatGPT for public use in late 2022. Since then, ChatGPT, a chatbot powered by OpenAI’s large language models (LLMs), has dominated headlines and preoccupied the minds of executives running Twitter, Google, Amazon, Microsoft, and Meta, inspiring them to create their own generative AI projects.

Also: ChatGPT vs. Microsoft Copilot vs. Gemini: Which is the best AI chatbot?

Though OpenAI offers a paid ChatGPT Plus subscription for $20 a month, free ChatGPT users can do a lot without the monthly fee. This includes accessing the GPT Store to use custom GPT bots, generating images, enjoying a sense of continuity in their conversations with Memory, uploading photos and documents to discuss them with ChatGPT, having the bot browse the web for more current context, using advanced data analysis, and accessing GPT-4o mini and GPT-4o, OpenAI’s newest and most advanced LLM.

How to use ChatGPT

ChatGPT has many capabilities that you can use for free, so we’ll cover how you can access the AI chatbot to try these features yourself.

1. Go to ChatGPT.com

Start by going to ChatGPT.com. You no longer need to create an account on OpenAI’s website to log in and access ChatGPT, but you must create a free account to access GPT-4o, view past conversations, generate images, and upload files. You also need to log in to use ChatGPT Plus.

Also: ChatGPT vs. ChatGPT Plus: Is a paid subscription still worth it?

To create an account, click “Sign up” on the bottom left of the chat screen and follow the prompts to enter your information. OpenAI requires a valid phone number for verification to create an account on its website.

2. Learn how to use ChatGPT

Once you open ChatGPT, learn how to navigate the user interface. Here’s a breakdown of what you will see:

Close sidebar and New chat buttons: On the left side of your screen, there’s a button that closes the sidebar and a ‘New chat’ button that you can click on to start a fresh conversation anytime. ChatGPT remembers what you discussed previously in a conversation and will respond with context. Starting a new chat creates a new discussion without context, though the bot may remember some details if you’re logged in.

Chat history: The left sidebar keeps your past conversations accessible in case you need to return to one (you can edit the title of each chat). You can also share your chat history with others, turn off chat history, delete individual chats, or delete your entire chat history. This is also where you can find the GPTs you’ve used in the past.

Account (if logged in): Clicking on your name on the bottom left of your screen gives you access to your account information, including settings, the option to log out, get help, and customize ChatGPT. If you don’t have ChatGPT Plus, you may see an ‘Upgrade plan’ button here to sign up for it.

ChatGPT dropdown menu: You can choose which model to use near the top of the screen, above your conversation. The options in the menu include ChatGPT Plus, ChatGPT, and Temporary Chat. ChatGPT Plus requires a subscription; ChatGPT uses GPT-4o until you reach your rate limit, then automatically switches to GPT-4o mini; and Temporary Chat starts a chat that won’t appear in your history, won’t be used in model training, and will have Memory turned off.

Your prompts: The questions or prompts you send the AI chatbot appear in the middle of the chat window, with your account photo or initials to the left.

ChatGPT’s responses: Whenever ChatGPT responds to your queries, the logo appears on the left. Below each response, you’ll see Read Aloud, Copy, Regenerate, Thumbs Down, and Change model buttons. You can copy the text to your clipboard to paste it elsewhere and provide feedback on whether the response was accurate. OpenAI uses this human feedback to fine-tune the AI tool through reinforcement learning.

Text area: This is where you enter your prompts and questions.

ChatGPT disclaimer: OpenAI includes some fine print below the text input area. The disclaimer reads: “ChatGPT can make mistakes. Check important info.” OpenAI includes this disclaimer because chatbots like ChatGPT can hallucinate and give nonsensical answers, so always fact-check their responses. This section previously showed the version of the ChatGPT model currently in use, but OpenAI removed this option.

Also: How to use ChatGPT to analyze PDFs (and more) for free

3. Start writing your prompts and questions

Now that you know how to access ChatGPT, you can ask the chatbot any burning questions and see what answers you get; the possibilities are endless. The ChatGPT tool can be useful in your personal life and for many work projects, from software development to writing to translations.

Also: 5 ways ChatGPT can help you write an essay

Type a ChatGPT prompt in the text bar at the bottom of the page, and click the submit button to pose your question. The chatbot will then generate text to answer your query.

ChatGPT doesn’t work like a search engine. Instead, the chatbot responds with information based on the training data in GPT-4 or GPT-4o. The free version of ChatGPT uses GPT-4o mini and, when available, GPT-4o, OpenAI’s smartest and fastest model. Like GPT-4o, GPT-4, which is accessible through a paid ChatGPT Plus subscription, can access the internet and respond with more up-to-date information.

ChatGPT users have come up with creative ideas for using the chatbot, from asking questions in search of funny answers to correcting a bug in code. Across all these areas, one thing is abundantly clear: this AI tool is remarkable not due to any particular innovations but rather because it’s accessible and easy to use.

How can I use GPT-4o?

GPT-4o is currently available for all ChatGPT users, free and paid. Once you begin using the ChatGPT free tier, you’ll start interacting with GPT-4o without selecting it as your preferred model. Once you hit your rate limit of about 15 prompts per three hours, the model will automatically switch to GPT-4o mini.

OpenAI deprecated older GPT models, leaving only GPT-4, GPT-4 Turbo, GPT-4o, and GPT-4o mini.

What’s the difference between GPT-4o and GPT-4o mini?

GPT-4o mini is an accessible version of GPT-4o, OpenAI’s largest ‘omni’ model. GPT-4o was launched as OpenAI’s largest multimodal model to date, and it can process visual, audio, and text data without resorting to other AI models, like Whisper, as GPT-4 does. GPT-4o mini is a more cost-efficient model than GPT-4o, though still more capable than GPT-3.5, which previously powered the free ChatGPT tier, and GPT-3.5 Turbo.

Also: What does a long context window mean for an AI model, like Gemini?

Aside from giving free ChatGPT users access to a larger, superior model by defaulting to GPT-4o mini rather than GPT-3.5, OpenAI is making GPT-4o mini a more affordable model in the API for developers. The model is 60% cheaper than GPT-3.5 Turbo, features a 128K token context window, with an output of up to 16K tokens per request, and is trained with information up to October 2023.

What’s the difference between free and paid ChatGPT?

Some big differences exist between using the free ChatGPT tier and a Plus subscription. Paid users will have up to five times the limit of free users and can generate more images with DALL-E 3. Subscribers can use GPT-4 when their GPT-4o limit runs out, while free users can use only GPT-4o mini when their quota runs out. ChatGPT Plus users can create GPT bots, while free tier users cannot.

In addition, ChatGPT Plus subscribers have priority access to new features from OpenAI, including the Advanced Voice Mode, which is gradually rolling out now.

What are some good prompts for ChatGPT?

ChatGPT’s responses to prompts are good enough that the technology can be an essential tool for content generation, from writing essays to summarizing a book.

The better the prompt, the better the response you’ll get. Here are examples of prompts you could start with:

How does a computer store and process information?

Analyze this code and tell me how to fix it: [Paste the code].

Write a poem about migraines in Walt Whitman’s style.

What is the difference between a virus and a bacterium?

Write a sick note for my child who has to miss school.

Write a song/poem about [insert topic here] (try adding multiple details).

Give it a list of ingredients from your pantry and ask it to write a recipe with them.

Ask it to summarize ideas or concepts.

Request a packing list for a three-day trip to the beach.

Your imagination is the limit. Have fun with different ChatGPT prompts. For example, ZDNET’s David Gewirtz asked the AI chatbot to write a WordPress plugin and used it to help him fix code faster. He also requested ChatGPT to write a Star Trek script and start a business using the technology and other AI tools.

Also: This simple AI tool may be the most useful new Pixel feature (it’s in iOS 18, too)

Others have used the tool to write malware. However, ChatGPT is an AI assistant programmed to reject inappropriate requests and avoid generating unsafe content, so it may push back if you give it potentially unethical requests.

Can I use ChatGPT without a login?

OpenAI allows users to access the free version of ChatGPT, powered by GPT-4o mini and GPT-4o, without logging in, though you must create an account to access your chat history. To access GPT-4, you need an account and a ChatGPT Plus subscription.

Also: My 3 favorite AI chatbot apps for iOS - and what you can do with them

If you’d rather access GPT-4o with a higher rate limit and without logging in to a website, you can use Microsoft Copilot, formerly Bing Chat, which uses OpenAI’s GPT-4o. You can log in with a Microsoft account for extended conversations. GPT-4o is a faster and smarter model than GPT-4.

What is ChatGPT Plus?

ChatGPT Plus is OpenAI’s paid subscription to ChatGPT that offers access to GPT-4o mini, GPT-4, and GPT-4o. A paid subscription gives you priority access to new features, has a higher usage limit than a free account, and lets you use the AI chatbot during peak times.

Also: If these chatbots could talk: The most popular ways people are using AI tools

Is ChatGPT accurate?

ChatGPT and other AI assistants are prone to misinformation because they’re trained on massive amounts of data humans create. These tools can be biased if their data is flawed and can give inaccurate responses, especially regarding world events.

Challenge any incorrect premises and always fact-check information from ChatGPT and other chatbots.

Can I use ChatGPT on my phone?

OpenAI offers mobile apps for iOS and Android. These apps provide a rich ChatGPT experience, letting you talk with ChatGPT through voice conversations powered by OpenAI’s Whisper without requiring a ChatGPT Plus subscription.

Also: These are my 4 favorite AI chatbot apps for Android

If you don’t want to download an app, use the AI-based tool in your mobile browser. The steps to use OpenAI’s ChatGPT from your mobile browser are the same as on a PC. Go to ChatGPT.com and start typing. The AI chatbot should work similarly to when you access it from your computer.

OpenAI has announced the release of fine-tuning capabilities for its GPT-4o model, a feature eagerly awaited by developers. To sweeten the deal, OpenAI is providing one million free training tokens per day for every organisation until 23rd September.

Tailoring GPT-4o using custom datasets can result in enhanced performance and reduced costs for specific applications. Fine-tuning enables granular control over the model’s responses, allowing for customisation of structure, tone, and even the ability to follow intricate, domain-specific instructions.

Developers can achieve impressive results with training datasets comprising as little as a few dozen examples. This accessibility paves the way for improvements across various domains, from complex coding challenges to nuanced creative writing.
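
In practice, a fine-tuning run comes down to uploading a JSONL file of chat-formatted examples and starting a job against a GPT-4o snapshot. The Python sketch below is a rough outline using the openai SDK’s files and fine_tuning.jobs endpoints; the file name is made up, and the model snapshot shown is an assumption, so check the fine-tuning dashboard for the exact GPT-4o snapshot available to your account.

```python
from openai import OpenAI

client = OpenAI()

# Each line of train.jsonl is one chat-formatted example, e.g.:
# {"messages": [{"role": "system", "content": "You are a SQL assistant."},
#               {"role": "user", "content": "List customers by region."},
#               {"role": "assistant", "content": "SELECT region, name FROM customers ORDER BY region;"}]}
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",  # assumption: pick the snapshot listed in your dashboard
)

print(job.id, job.status)
# Check progress later with: client.fine_tuning.jobs.retrieve(job.id)
```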

“This is just the start,” assures OpenAI, highlighting their commitment to continuously expand model customisation options for developers.

GPT-4o fine-tuning is available immediately to all developers across all paid usage tiers. Training costs are set at $25 per million tokens, with inference priced at $3.75 per million input tokens and $15 per million output tokens.
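
To put those prices in context, here is a rough back-of-the-envelope estimate. It assumes the usual convention that billed training tokens equal the tokens in the training file multiplied by the number of epochs; the file size and epoch count below are invented for illustration.

```python
# Quoted prices in USD per million tokens.
TRAINING_PER_M = 25.00
INPUT_PER_M = 3.75
OUTPUT_PER_M = 15.00

# Hypothetical fine-tuning run.
file_tokens = 500_000   # tokens in the training file
epochs = 3

billed_training_tokens = file_tokens * epochs
training_cost = billed_training_tokens / 1_000_000 * TRAINING_PER_M
print(f"Estimated training cost: ${training_cost:.2f}")  # $37.50

# A later inference call with 2,000 input tokens and 500 output tokens:
inference_cost = 2_000 / 1_000_000 * INPUT_PER_M + 500 / 1_000_000 * OUTPUT_PER_M
print(f"Cost of one sample request: ${inference_cost:.4f}")  # about $0.015
```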

OpenAI is also making GPT-4o mini fine-tuning accessible with two million free daily training tokens until 23rd September. To access this, select “gpt-4o-mini-2024-07-18” from the base model dropdown on the fine-tuning dashboard.

The company has collaborated with select partners to test and explore the potential of GPT-4o fine-tuning:

Cosine’s Genie, an AI-powered software engineering assistant, leverages a fine-tuned GPT-4o model to autonomously identify and resolve bugs, build features, and refactor code alongside human developers. By training on real-world software engineering examples, Genie has achieved a state-of-the-art score of 43.8% on the new SWE-bench Verified benchmark, marking the largest improvement ever recorded on this benchmark.

Distyl, an AI solutions provider, achieved first place on the BIRD-SQL benchmark after fine-tuning GPT-4o. This benchmark, widely regarded as the leading text-to-SQL test, saw Distyl’s model achieve an execution accuracy of 71.83%, demonstrating superior performance across demanding tasks such as query reformulation and SQL generation.

OpenAI reassures users that fine-tuned models remain entirely under their control, with complete ownership and privacy of all business data. This means no data sharing or utilisation for training other models.

Stringent safety measures have been implemented to prevent misuse of fine-tuned models. Continuous automated safety evaluations are conducted, alongside usage monitoring, to ensure adherence to OpenAI’s robust usage policies.

