Aakash Gupta on LinkedIn: This is critical to how LLMs operate. The tokens generated by the LLM are… (2024)

Aakash Gupta

Builder @Think Evolve | Data Scientist

This is critical to how LLMs operate: each token the LLM generates is conditioned on the tokens before it. So if an early token is incorrect, the model will keep building on the wrong response. I am sharing links to two quick experiments that I performed. I would have loved to share the images and video clips, but the following links will do:

1. 3307 is not a prime number: https://lnkd.in/dBi-y9ic
2. 3307 is a prime number: https://lnkd.in/dipkmPam

How do you prevent such hallucinations? CoT, or Chain-of-Thought, is one of the techniques used to ground the LLM and make it question its assumptions. In the first instance, I asked the LLM to check its assumptions and gave it the information that a prime number cannot be divided by any prime smaller than itself. This made it reevaluate its statement and provide the correct answer. 😀

FYI, anthropomorphism is the attribution of human traits, emotions, or intentions to non-human entities, including animals, deities, and objects. The term is often used to describe how people perceive inanimate objects or animals as having human-like qualities or characteristics.
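For reference, a few lines of Python settle the primality question deterministically, independent of any LLM. This is plain trial division, not part of the experiments themselves:

```python
import math

def is_prime(n: int) -> bool:
    """Trial division: n is prime iff no integer in [2, sqrt(n)] divides it."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

print(is_prime(3307))  # True: 3307 has no divisor up to isqrt(3307) = 57
```

Checking up to the square root is enough, because any composite number must have a factor no larger than its square root.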


More Relevant Posts

  • Aakash Gupta · Builder @Think Evolve | Data Scientist

    Interesting article, which shows that indiscriminate use of recursively generated text can introduce defects in LLMs. The authors note that the long-tail, low-probability samples begin to disappear, leading to model collapse. Data collected from genuine human interactions is far more valuable and produces better models. A toy sketch of the effect follows below the link.
    https://lnkd.in/d-x6i6sk

    AI models collapse when trained on recursively generated data - Nature nature.com
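    As mentioned above, here is a toy illustration of the effect (my own sketch, not the paper's setup): repeatedly fit a Gaussian to data sampled from the previous generation's fit, and the fitted spread tends to shrink away.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=20)  # generation 0: "human" data

for gen in range(1, 101):
    # Fit a Gaussian to the current data, then train the next
    # generation only on synthetic samples from that fit.
    mu, sigma = data.mean(), data.std()
    data = rng.normal(mu, sigma, size=20)
    if gen % 20 == 0:
        print(f"generation {gen:3d}: fitted sigma = {sigma:.3f}")

# The fitted sigma tends to shrink generation over generation: rare
# (tail) events stop being sampled, so the next fit never sees them --
# a toy version of the collapse the paper describes.
```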



  • Aakash Gupta · Builder @Think Evolve | Data Scientist

    LinkedIn now has a Facebook-like video section! 🤔🤔


  • Aakash Gupta · Builder @Think Evolve | Data Scientist

    When do you use Small Language Models (SLMs) as coding assistants? SLMs are useful when you have proprietary code or data that you don't want to send to an external API (like Copilot, Claude, or OpenAI). They can be set up on a local instance or on your own private cloud, ensuring the privacy and security of your data. DeepSeek Coder seems to be the model of choice, with models trained on 2 trillion tokens covering 80+ programming languages. It comes in sizes from 1B to 33B parameters and can be used commercially, but performance can vary significantly with the smaller sizes. So the "fun" is in fine-tuning the smaller models for your use case; a sketch of running one locally is shown below. Relevant links shared in the next comment.
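    A minimal sketch of what running one locally looks like, assuming the Hugging Face transformers library and the 1.3B instruct checkpoint name (verify the exact model id on the Hub before running); the point is that prompts and code never leave your machine:

```python
# Requires: pip install transformers accelerate torch
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name; verify on huggingface.co before use.
model_id = "deepseek-ai/deepseek-coder-1.3b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory vs float32
    device_map="auto",           # GPU if available, else CPU
)

prompt = "# Write a Python function that reverses a linked list\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```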


  • Aakash Gupta · Builder @Think Evolve | Data Scientist

    What are the best practices for RAG? Use the following methods to optimize your RAG workflow:

    1. HyDE + hybrid search for retrieval
    2. monoT5 for reranking
    3. LLM-Embedder for embeddings
    4. Milvus as the vector database

    - For fine-tuning, use the Disturb method rather than the Random or Normal methods
    - For summarization, use Recomp (extractive) and LongLLMLingua (abstractive)
    - Use the Reverse method for repacking

    These insights are from a paper published by scholars of Fudan University and CSKL. I need to take a closer look at their published code; at first glance I couldn't make sense of it 😐😐😐 A toy sketch of the retrieve-then-rerank flow is shown below.

    "Searching for Best Practices in RAG"
    📰 Paper: https://lnkd.in/dufuut2h
    🤖 GitHub: https://lnkd.in/d9yzckfh

    #largelanguagemodels #MPT #generativeai #generativemodels #hallucinations #guardrails #aiforgood #GPT4ALL #Naomic.ai #Openai #llama #RefinedWeb #falconLLM #nlp #languagegenerations #foundationalmodels #foundationmodels #computervision #segmentanything #SAM #FAIR #RAI #ResponsiblityAwareAI #Codex #HumanEval #CoPilot #Github #Nerf #photogrammetry #RealityStudio #Blender #Unity3D #tabulardata #tabulartransformers #huggingface #transformers #scalinglaws
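    To make the retrieve-then-rerank shape concrete, here is a toy hybrid-search sketch (my own illustration, not the paper's code). The embed() function is a hashed bag-of-words stand-in for a real embedding model such as LLM-Embedder, and a real pipeline would pass the top candidates to a cross-encoder like monoT5:

```python
import numpy as np

docs = [
    "Milvus is an open-source vector database.",
    "monoT5 reranks passages for retrieval.",
    "HyDE generates a hypothetical answer to embed as the query.",
]

def embed(text: str) -> np.ndarray:
    """Stand-in embedding: hashed bag-of-words (toy, not stable across
    runs). A real pipeline would use a model like LLM-Embedder here."""
    v = np.zeros(64)
    for tok in text.lower().split():
        v[hash(tok) % 64] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

def hybrid_scores(query: str, alpha: float = 0.5) -> np.ndarray:
    """Mix a sparse keyword-overlap score with dense cosine similarity."""
    q_terms = set(query.lower().split())
    sparse = np.array([len(q_terms & set(d.lower().split())) / len(q_terms)
                       for d in docs])                         # keyword overlap
    dense = np.array([embed(query) @ embed(d) for d in docs])  # cosine
    return alpha * sparse + (1 - alpha) * dense

query = "which vector database should I use"
ranked = np.argsort(-hybrid_scores(query))
# The top-k candidates would then go to a cross-encoder (monoT5) to rerank.
print([docs[i] for i in ranked[:2]])
```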


  • Aakash Gupta · Builder @Think Evolve | Data Scientist

    Can you install FlashAttention on Windows? But first, what is FlashAttention? 🔦🔦🌩️🌩️

    In transformers, self-attention weighs the importance of the different tokens in a sequence when making a prediction. This is a major bottleneck for larger contexts, since the computation grows quadratically: a sequence of length n requires on the order of n² operations. This quadratic scaling hurts long sequences in both computation and memory. Much of the cost comes from the high number of read-write operations between the GPU's High Bandwidth Memory (HBM) and its on-chip SRAM. By reducing the number of IO operations, FlashAttention significantly speeds up the forward and backward passes. It also enables longer contexts in transformers, yielding higher-quality models; the training speed-up is on the order of 3-3.5x for GPT-2 small/medium models.

    The speed-up comes from not storing the attention matrices S and P to HBM, and instead recomputing them during the backward pass from blocks of Q, K, V held in SRAM, which can be seen as selective gradient checkpointing. Tiling, i.e., dividing the matrices into smaller blocks, lets the whole algorithm run in a single CUDA kernel: load the input from HBM, perform the computation steps, and write the result back to HBM. A sketch of the naive baseline is shown below.

    📰 Paper: https://lnkd.in/ddH6bAna
    🤖 GitHub: https://lnkd.in/dDXDry8S (official implementation)

    To my earlier question: yes, it can be installed on Windows, but it's a more convoluted process than a simple pip install! And if you are using WSL2, be careful where the weights are stored, since it can make a big difference in loading speeds. More in the pipeline.

    #largelanguagemodels #MPT #generativeai #generativemodels #hallucinations #guardrails #aiforgood #GPT4ALL #Naomic.ai #Openai #llama #RefinedWeb #falconLLM #nlp #languagegenerations #foundationalmodels #foundationmodels #computervision #segmentanything #SAM #FAIR #RAI #ResponsiblityAwareAI #Codex #HumanEval #CoPilot #Github #Nerf #photogrammetry #RealityStudio #Blender #Unity3D #tabulardata #tabulartransformers #huggingface #transformers #scalinglaws
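    To see where the n² comes from, here is the naive attention baseline in plain NumPy (an illustration of what FlashAttention replaces, not FlashAttention itself); the (n, n) matrices S and P are exactly what FlashAttention avoids materializing in HBM:

```python
import numpy as np

def naive_attention(Q, K, V):
    """Baseline attention: materializes the full n x n matrices
    S and P -- the HBM traffic that FlashAttention eliminates."""
    d = Q.shape[-1]
    S = Q @ K.T / np.sqrt(d)                  # (n, n) scores
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)        # (n, n) softmax
    return P @ V                              # (n, d) output

n, d = 4096, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = naive_attention(Q, K, V)
# S and P each hold n*n = 16.7M floats; doubling n quadruples that,
# while the output stays a modest (n, d).
print(out.shape)  # (4096, 64)
```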


  • Aakash Gupta · Builder @Think Evolve | Data Scientist

    Our recent open-source paper was published in the journal Nordic Machine Intelligence.


  • Aakash Gupta · Builder @Think Evolve | Data Scientist

    I have devoted 🙂 considerable time to studying papers related to generative AI: slightly more than 30 papers over a period of 6-7 months, from June 2023 to Jan 2024. It's been an enlightening journey, delving into core concepts and methodologies to gain a deeper understanding of the field. My summarized notes were posted as LinkedIn posts ✍; I have finally collated them on a single page. Let's continue the conversation! 🤝 https://lnkd.in/df53QBwc

    Large Language Models: 30 Papers That Matter (https://www.thinkevolveconsulting.com)

