Monday, December 23, 2024

Living the future | Dialogue | thenews.com.pk


As we airily go about our daily lives, generative artificial intelligence (GenAI) continues to seep into them, with tech giants developing and evolving their own tools. Not long ago, Meta AI was introduced in WhatsApp. It was meant to be one’s virtual best friend. One could ask it anything and get all sorts of answers. We now have Gemini AI, which has given a whole new meaning to the Google search engine.

Tech giants are integrating AI into their platforms in various ways, each with unique approaches and goals. Meta AI focuses on building a general-purpose AI that can perform various tasks, such as powering chatbots and recognising images. It adopts the approach of developing a unified AI framework that can be applied across Meta’s platforms, including Facebook, Instagram and WhatsApp.

Gemini by Google enhances search with AI-powered features such as natural language processing and knowledge graph enhancements. It integrates AI into the Google search engine to improve result accuracy and user experience.

ChatGPT from OpenAI, a general-purpose language model that can generate human-like text and perform various tasks, is increasingly popular. Its approach is to train large language models on vast datasets so that they generate coherent and context-specific text.

Amazon’s Alexa is a virtual assistant that can perform tasks, answer questions and control smart home devices. Using natural language processing and machine learning, it improves its capabilities to meet the user’s needs.

Samsung’s Bixby is similarly a virtual assistant that integrates with Samsung devices and offers personalised experiences. It uses AI to learn user behaviour and preferences, improving its suggestions and recommendations.

Each of these AI platforms has a unique purpose, ranging from general-purpose AI (Meta AI) to search enhancement (Gemini) to virtual assistants like the trusty Alexa and the wise Bixby. The approach to AI development varies too: some, such as Meta, focus on unified frameworks, while others, such as GPT, focus on specific applications. The scope of AI integration also differs. Some, like Meta, aim for broad platform integration; others focus on specific features.

Gemini, developed by Google, is a unique AI system that stands out from other AI applications in several ways. It is specifically designed to enhance search capabilities, unlike general-purpose AI applications like Meta AI or GPT. Its primary focus is to improve the search engine’s understanding of natural language and generate more accurate results. This new GenAI is deeply integrated with Google’s Knowledge Graph, which contains a vast amount of structured data on entities, relationships and concepts. This integration enables Gemini to provide more informative and contextually relevant search results.

Gemini employs cutting-edge NLP techniques, such as semantic search and query understanding, to better comprehend the nuances of human language. This allows it to handle complex searches and provide more precise answers. It is designed to understand the context of a search query, taking into account factors like user location, search history and preferences. This contextual understanding enables it to provide more personalised and relevant results.

Gemini supports multimodal search, allowing users to search using images, videos or audio, in addition to text-based queries. This feature leverages Google’s advanced computer vision and audio processing capabilities. It is built for real-time processing, enabling it to handle massive volumes of searches simultaneously, while providing rapid and accurate results.

Gemini’s architecture is designed for scalability, allowing it to handle the massive search volume Google processes daily, while maintaining high performance and accuracy. It is designed to learn from user interactions and feedback, enabling it to improve its search results and accuracy over time.

By focusing on search enhancement and integrating with Google’s Knowledge Graph, Gemini provides a powerful search experience. For example, searching “What is the capital of France?” on Google yields a direct answer, “Paris,” along with additional information such as maps and weather reports, thanks to Gemini’s integration with the Knowledge Graph.

Meta AI can generate human-like text responses to user queries, like a chatbot. Like a writing assistant, ChatGPT can generate a short story or article based on a prompt.

These AI tools demonstrate various approaches and specialisations, showcasing the diverse applications of AI technology. Gemini is user-friendly and provides content creators with a plethora of exciting features. It can generate three alternative responses for each prompt, increasing the probability of finding an output that meets your specific needs. If the responses do not meet expectations, they can be easily regenerated. Custom instructions or pre-set commands can be used to highlight and regenerate specific sections of text. Additionally, a dedicated menu allows users to customise the length and tone of the responses, choosing between a more casual or a more professional tone. Users can also adjust the intended word count. This is something ChatGPT inherently lacks.

The cherry on top is that Gemini enables direct export to Google Docs or insertion into Gmail, facilitating efficient workflows. Another feature compares the generated text with search engine results, highlighting areas of similarity or potential factual discrepancies. Like ChatGPT, at present, the platform only permits the rewriting of the most recent prompt. Previous inputs are not accessible for revision. The alternative draft options are no longer available for review after a response has been selected and reworked.

Gemini is an exceptional tool for content generation. It is precise, adaptable and facilitates substantial time savings. Although users must make some adjustments when using it, most will adapt to these readily, and they do not diminish the tool’s overall value. The capacity to revisit and revise previous prompts in a conversation is a significant advantage. The model’s prototype responses can be navigated effortlessly. This enables a refined and iterative writing process.

ChatGPT’s capability is considerably enhanced by the extensive library of community-developed and partner-provided plug-ins in OpenAI’s robust plug-in ecosystem. These tools can be indispensable for productivity optimisation, data analysis and research. However, the absence of granular editing functions for the output is a significant drawback. The current inability to export content directly in a variety of formats, particularly tables, can be a burden. Although there are workarounds, such as copying tables as images or text, these may not always be seamless.


The writer is the CEO at ZAK Casa and Verde as well as a managing partner at the law firm Lex Mercatoria.
