Aakash Gupta
Builder @Think Evolve | Data Scientist
This is critical to how LLMs operate. Each token the LLM generates depends on the ones before it, so if the first token is incorrect, the model will keep building on the wrong response. I am sharing links to two quick experiments that I performed. Would have loved to share the images and video clips, but the following links will do:
1. 3307 is not a prime number: https://lnkd.in/dBi-y9ic
2. 3307 is a prime number: https://lnkd.in/dipkmPam
How do you prevent such hallucinations? CoT, or Chain-of-Thought, is one of the techniques used to ground the LLM and make it question its assumptions. In the first instance, I asked the LLM to check its assumptions and gave it the information that a prime number cannot be divided by any prime smaller than itself. This made it reevaluate its statement and provide the correct answer. 😀
FYI, anthropomorphism is the attribution of human traits, emotions, or intentions to non-human entities, including animals, deities, and objects. The term is often used to describe how people perceive inanimate objects or animals as having human-like qualities or characteristics.
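For the record, the ground truth is easy to verify in code. Here is a minimal trial-division check in Python (my own sketch, not part of the linked experiments) confirming that 3307 is indeed prime:

```python
import math

def is_prime(n: int) -> bool:
    """A prime has no divisor other than 1 and itself, so it is
    enough to test candidate divisors up to sqrt(n)."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

print(is_prime(3307))  # True: 3307 is prime
```

Checking divisors only up to sqrt(3307), roughly 57, keeps the loop tiny; no number up to 57 divides 3307.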
-
Interesting article, which shows that indiscriminate use of recursively generated text can create defects in LLMs. The authors note that the long-tail, low-probability samples begin to disappear, leading to model collapse. The value of data collected from genuine human interactions is far greater, and it creates better models. https://lnkd.in/d-x6i6sk
-
LI now has an FB-like video section! 🤔🤔
-
When do you use Small Language Models as coding assistants? SLMs are useful when you have proprietary code or data that you don't want to send to an external API (like Copilot, Claude or OpenAI). They can be set up on a local instance or on your own private cloud, ensuring the privacy and security of your data. DeepSeek Coder seems to be the model of choice, trained on 2 trillion tokens covering 80+ programming languages. It is available in various sizes, from 1B to 33B parameters. It can be used for commercial purposes, but performance can vary significantly with the smaller models. So the "fun" is when you fine-tune the smaller models for your use case. Relevant links shared in the next comment.
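To make this concrete, here is a minimal local-inference sketch using Hugging Face transformers. The checkpoint id `deepseek-ai/deepseek-coder-1.3b-instruct` is one illustrative choice and the prompt is made up; treat this as a starting point, not a production setup:

```python
# Minimal sketch: run a small coding model entirely on local hardware,
# so no code or data leaves your machine.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-1.3b-instruct"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" (needs the accelerate package) places weights on GPU if available
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "# Python function that checks whether a number is prime\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same script should work for the larger variants if your GPU memory allows; only `model_id` changes.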
-
Performance tip while training computer vision models: when training locally, always keep your training files (images/videos) on an SSD. There is a significant difference in I/O throughput when the dataloader reads from an SSD versus a spinning disk.
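A quick way to see this for yourself is to time a data-loading-only pass over the same dataset stored on each drive. Below is a hypothetical PyTorch micro-benchmark; the path is a placeholder for wherever your copies live:

```python
# Hypothetical micro-benchmark: measure pure read + decode time for one
# pass over the dataset, with no model in the loop.
import time
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Swap this path between an SSD copy and an HDD copy of the same data.
dataset = datasets.ImageFolder("/mnt/ssd/train", transform=transforms.ToTensor())
loader = DataLoader(dataset, batch_size=64, num_workers=4)

start = time.perf_counter()
for _ in loader:
    pass  # iterate only: I/O and image decoding dominate here
print(f"one epoch of loading took {time.perf_counter() - start:.1f}s")
```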
-
𝐖𝐡𝐚𝐭 𝐚𝐫𝐞 𝐭𝐡𝐞 𝐛𝐞𝐬𝐭 𝐩𝐫𝐚𝐜𝐭𝐢𝐜𝐞𝐬 𝐟𝐨𝐫 𝐑𝐀𝐆?
Use the following methods to optimize your RAG workflow:
1. HyDE + Hybrid Search for Retrieval (toy sketch after this post)
2. monoT5 for Reranking
3. LLM-Embedder for Embeddings
4. Milvus as a Vector Database
- For fine-tuning, use Disturb instead of Random or Normal methods
- For summarization, use Recomp (extractive) and LongLLMLingua (abstractive)
- Use the Reverse method for repacking
These insights are from a paper published by scholars of Fudan University and CSKL. I need to take a closer look at their published code; at first glance I couldn't make sense of it 😐😐😐
"Searching for Best Practices in RAG"
📰 Paper: https://lnkd.in/dufuut2h
🤖 Github: https://lnkd.in/d9yzckfh
#largelanguagemodels #MPT #generativeai #generativemodels #hallucinations #guardrails #aiforgood #GPT4ALL #Naomic.ai #Openai #llama #RefinedWeb #falconLLM #hallucinations #nlp #languagegenerations #generativemodels #foundationalmodels #foundationmodels #computervision #segmentanything #SAM #FAIR #RAI #ResponsiblityAwareAI #Codex #HumanEval #CoPilot #Github #Nerf #photogrammetry #RealityStudio #Blender #Unity3D #tabulardata #tabulartransformers #huggingface #transformers #scalinglaws
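As a toy illustration of the hybrid-search idea from point 1, the sketch below blends a sparse keyword-overlap score with a dense cosine score. This is my own illustrative sketch, not the paper's code: a real pipeline would use BM25 plus a trained embedder (e.g. LLM-Embedder) backed by Milvus, and the embeddings here are random stand-ins.

```python
# Toy hybrid retrieval: score = alpha * sparse + (1 - alpha) * dense.
import numpy as np

docs = [
    "Milvus is an open-source vector database",
    "monoT5 reranks retrieved passages",
    "HyDE embeds a hypothetical answer as the query",
]

rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(len(docs), 8))  # stand-in document embeddings

def sparse_score(query: str, doc: str) -> float:
    # Keyword overlap as a cheap proxy for BM25.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def dense_score(q_vec: np.ndarray, d_vec: np.ndarray) -> float:
    return float(q_vec @ d_vec / (np.linalg.norm(q_vec) * np.linalg.norm(d_vec)))

def hybrid_rank(query: str, q_vec: np.ndarray, alpha: float = 0.5):
    scores = [alpha * sparse_score(query, d) + (1 - alpha) * dense_score(q_vec, v)
              for d, v in zip(docs, doc_vecs)]
    return sorted(zip(scores, docs), reverse=True)

for score, doc in hybrid_rank("vector database", rng.normal(size=8)):
    print(f"{score:.3f}  {doc}")
```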
-
Can you install FlashAttention on Windows? But first, what is FlashAttention? 🔦🔦🌩️🌩️
In transformers, self-attention is used to weigh the importance of different words or tokens in a sequence when making a prediction. This is a major bottleneck for larger contexts, since the computation grows with quadratic complexity: for a sequence of length 𝑛, it takes on the order of 𝑛² operations. This quadratic scaling becomes a significant bottleneck for long sequences, both in terms of computation and memory usage. A large part of the cost comes from the high number of read-write operations between the GPU's High Bandwidth Memory (HBM) and the GPU's on-chip SRAM.
By reducing the number of IO operations, there can be a significant speed-up in the forward and backward passes. This also enables longer contexts in transformers, yielding higher-quality models. The speed-up in training time is on the order of 3-3.5x for GPT-2 small/medium models.
FlashAttention achieves the speedup by not storing the matrices S and P in HBM, but recomputing them during the backward pass from the blocks of Q, K, V held in SRAM; this can be seen as selective gradient checkpointing. Tiling, i.e. dividing the matrices into smaller blocks, enables the whole algorithm to run in a single CUDA kernel: load the input from HBM, perform all the computation steps, then write the result back to HBM.
📰 Paper: https://lnkd.in/ddH6bAna
🤖 Github: https://lnkd.in/dDXDry8S (_official implementation_)
To my earlier question: yes, it can be installed on Windows, but it's a more convoluted process than simply doing a pip install! And if you are using WSL2, be careful where the weights are stored, since it can make a big difference in loading speeds. More in the pipeline.
#largelanguagemodels #MPT #generativeai #generativemodels #hallucinations #guardrails #aiforgood #GPT4ALL #Naomic.ai #Openai #llama #RefinedWeb #falconLLM #hallucinations #nlp #languagegenerations #generativemodels #foundationalmodels #foundationmodels #computervision #segmentanything #SAM #FAIR #RAI #ResponsiblityAwareAI #Codex #HumanEval #CoPilot #Github #Nerf #photogrammetry #RealityStudio #Blender #Unity3D #tabulardata #tabulartransformers #huggingface #transformers #scalinglaws
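If you just want to try a flash-style kernel without building the official package, recent PyTorch versions expose a fused attention op that can dispatch to a FlashAttention implementation. A small sketch (PyTorch 2.x API, not the flash-attn package itself; availability depends on your GPU and build):

```python
# Sketch: PyTorch's fused scaled-dot-product attention can dispatch to a
# FlashAttention kernel, avoiding materializing the full n-by-n attention
# matrix in HBM. Requires a CUDA GPU and PyTorch 2.x.
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 2048, 64, device="cuda", dtype=torch.float16)  # (batch, heads, seq, head_dim)
k, v = torch.randn_like(q), torch.randn_like(q)

# Force the flash backend so a fallback doesn't silently take over.
with torch.backends.cuda.sdp_kernel(enable_flash=True,
                                    enable_math=False,
                                    enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v)

print(out.shape)  # torch.Size([1, 8, 2048, 64])
```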
-
Our recent open-source paper was published in the journal Nordic Machine Intelligence.
-
I have devoted 🙂 considerable time to studying papers related to generative AI, reading slightly more than 30 papers over a period of 6-7 months, from June 2023 to Jan 2024. ⛈ == ☃ It's been an enlightening journey, delving into core concepts and methodologies to gain a deeper understanding of the field. My summarized notes were posted as LI posts ✍, and I have finally collated them on a single page. Let's continue the conversation! 🤝 https://lnkd.in/df53QBwc