The Humans in the Loop: Big News for Little LLMs

This Week in AI for Devs: New Small Model Advancements

May 02, 2024

Welcome to The Humans in the Loop, your executive summary of AI news for devs. Know someone who might want to receive this? Email us at thehumansintheloop@heavybit.com.

Our top story: OSS and Enterprises Hit New Small-Model Milestones

Last year, there was a race to train models with as many billions of parameters as possible. Today, the race is to do more with less, especially in the 3B parameter range. AAPL, which is rumored to launch its first consumer AI offering with the next rev of iOS, has launched OpenELM, a new suite of open-source text-generation models optimized for deploying on-device. MSFT has launched its smallest model yet, the 3.8B-parameter Phi-3-mini, which can support a hefty context length of 128K tokens. Not to be outdone, startup Gradient AI’s latest 8B model, Llama-3 8B Gradient Instruct 1048k, supports a context length of a whopping 1M tokens. Maybe small-but-mighty is the next big step in language models?

And now, the most important AI headlines and why they matter for devs.

💻 Development

Toolkit: Cohere’s AI Tools for Building Enterprise AI Apps
In addition to its commercial RAG and search products, AI startup Cohere has released a new free toolkit to build AI apps faster. Try it here.

[GH] OpenLIT - GenAI and LLM Observability Platform
This OTel-native project lets teams add LLM observability “with just a single line of code,” joining more-heavyweight projects such as langfuse, phoenix, and traceloop.

Running Open LLMs? There Are Apps for That
Mozilla’s llamafile makes hosting open LLMs on your own rig easy. Jan.ai lets you host on your Mac. Need more compute? Replicate will host your open model for you.

Updated Codegen Model: StarCoder-2-15B-Instruct
The latest StarCoder hits an impressive score of 72.6 on HumanEval (a benchmark designed to evaluate codegen LLMs). Try it here.

Guide: Build an Email Assistant With Burr, GPT4, and FastAPI
This guide walks through how to use Burr, an OSS framework for building AI apps from DAGWorks, with GPT4 and FastAPI to build a working email app.

🤔 Interesting AI Projects, Research, and Updates

Research: NExT - Reasoning About Code Execution
GOOG DeepMind researchers’ new self-tuning methodology reportedly helps models noticeably improve the fix rate of codegen.

[GH] Pebblo - Compliance and Sec for GenAI Apps
This project takes care of sec and compliance to help you deploy GenAI apps faster.

Research: AI Model Disgorgement
Model trained on bad/copyrighted data? AWS researchers’ sharding method may “remove the effects” of training on undesired datasets.

In-Depth Report: LLM Safety Challenges
Pack a lunch and settle in for this comprehensive report, which covers 200+ modern sec, privacy, and safety issues with LLMs.

💼 Hiring and Community

Startups Hiring This Week:
- Founding AI Engineer @ Second
- Senior AI Software Engineer @ Dashworks
- Data Stream Processing Eng, Apache Flink @ Promoted.ai

Mid-Market Companies Hiring This Week:
- Data Engineer @ Verana Health
- AI Product Eng Lead @ Hex Technologies

Enterprises Hiring This Week:
- Founding Data Scientist @ OpenAI
- AI Product Manager @ MSFT
- Software Eng, Big Data & AI @ AAPL

💡 Spotlight: Newly-Launched AI Startups This Week

- TrojAI: $5.75M from Flying Fish, Alteryx, Flybridge for enterprise AI sec
- SydeLabs: $2.5M from RTP Global, Picus Capital for AI cybersec
- Prophet: $11M from Bain Capital for AI-synthesized threat alerts
- FlexAI: $30M from AIC, Elaia, Heartcore for AI infra
- Prime Intellect: $5M from Distributed Global, Coinfund for decentralized compute

🏭 Industry: M&A, Launches, Trends

NVDA Acquires Run.AI and Deci to Mitigate Compute Needs
LLMs need tons of compute, and NVDA leads in the hardware that provides the horsepower. The chip giant makes two acquisitions to help manage those workloads.

AAPL Quietly Acquires Computer Vision Startup Datakalab
AAPL apparently snapped up the French computer image analysis startup, possibly to support its long-awaited first-party AI offerings.

GPT4 Edged Out by China’s SenseNova 5.0
Chinese firm SenseTime’s 600B model beats GPT4 in most benchmarks, boasting a 200K-token window. Look out, Claude-3 Opus.

Snowflake Arctic: 480B Enterprise OSS Model for SQL and Coding
The Apache 2.0-licensed model reportedly “excels at enterprise tasks such as SQL generation, coding, and instruction.” Try it here.

⚖️ Copyright, IP, Licensing, and Regulation

US Federal AI Advisory Board Has a Bunch of AI CEOs
Regulation accelerates with DHS veterans…and the CEOs of GOOG, OpenAI, NVDA, and Anthropic, who will surely be totally objective.

GOOG Urges Updated US Immigration Policy for AI Talent
We’ve reported that China is winning the AI talent war, leading GOOG to push for updated policies to bring in more AI experts from abroad.

That’s all for this edition—we’ll see you in another two weeks with more updates. In the meantime, please feel free to follow us on Twitter and LinkedIn.
