ibm.com
Faster LLMs: Accelerate Inference with Speculative Decoding
Isaac Ke explains speculative decoding, a technique that accelerates LLM inference speeds by 2-4x without compromising output quality. Learn how "draft and verify" pairs smaller and larger models to optimize token generation, GPU usage, and resource efficiency.
9 months ago
Fast Inference from Transformers via Speculative Decoding Transformer Models
Speculative Decoding — Think Fast⚡, Then Think Right✅
substack.com
11 months ago
41:28
Transformer decoders explained step-by-step from scratch
MSN
Learn With Jay
2 months ago
1:25
Transformer Visualization From Text To Prediction
YouTube
Karela Technologies
10 views
3 months ago
Top videos
0:18
Introducing LM Studio 0.3.10 with 🔮 Speculative Decoding! It's an LLM inferencing technique that can speed up token generation by up to 1.5x-3x in some cases 🏎️💨 - Supported for both GGUF and… | LM Studio | 10 comments
linkedin.com
10 views
Feb 19, 2025
23:40
Speculative Speculative Decoding: How to Parallelize Drafting and ... for 2x Faster LLM Inference
YouTube
Xiaol.x
1 week ago
5:13
Speculative Speculative Decoding
YouTube
Intellectually Curious Podcast
2 views
2 weeks ago
Fast Inference from Transformers via Speculative Decoding NLP Inference Speedup
DFlash Boosts Speculative Decoding with Lightweight Block Diffusion | Kalyan KS posted on the topic | LinkedIn
linkedin.com
2 views
2 months ago
Natural Language Processing: NLP With Transformers in Python
git.ir
29.3K views
Oct 19, 2022
1:06
Matt Johnson vs Scott Thornton Mar 9, 2004
YouTube
hockeyfights.com
9.2K views
Mar 9, 2004
0:44
Speculative Decoding: The inference technique that will chan…
649 views
Feb 23, 2025
YouTube
Devansh: Chocolate Milk Cult Leader
How to Quadruple LLM Decoding Performance with Speculative Dec…
Aug 1, 2024
qualcomm.com
6:18
What is Speculative Sampling? | Boosting LLM inference speed
3.9K views
Nov 20, 2024
YouTube
AssemblyAI
15:00
What is LLM speculative decoding?
Mar 10, 2025
pornmaven.com
11:34
Generate 10 Tokens At Once - Faster LLM INFERENCE - AdaSPE…
480 views
4 months ago
YouTube
Vuk Rosić
1:23
Speculative Speculative Decoding for Faster LLM Inference
1.3K views
1 week ago
YouTube
Rajistics - data science, AI, and machine learning
1:41
How to speed up AI without new hardware
1K views
5 months ago
YouTube
Red Hat
37:34
Speculative Decoding Explained
7.8K views
Dec 21, 2023
YouTube
Trelis Research
14:37
Understanding Speculative Decoding: Boosting LLM Efficienc…
427 views
11 months ago
YouTube
MLWorks
17:56
Behind the Stack, Ep 11 - Speculative Decoding
70 views
4 months ago
YouTube
Doubleword
2:27:59
COLING 2025 Tutorial: Speculative Decoding for Efficient LLM Inference
398 views
Jan 23, 2025
bilibili
云安Ann
24:17
Fast Inference from Transformers via Speculative Decoding
1.2K views
Sep 12, 2023
YouTube
Arxiv Papers
8:44
How to PROPERLY Use Speculative Decoding in LM Studio to DOUBL…
980 views
1 month ago
YouTube
AsapGuide
0:46
Speculative Decoding Turbocharge Your LLM Inference! #ai, #llm, #de…
66 views
1 month ago
YouTube
The Code Architect
40:31
CS 886 | Lecture 13 Efficient LLM Inference | PABEE, CALM and Spe…
1.2K views
Mar 3, 2024
YouTube
Rushabh Solanki
12:56
Weekly Expiry Rationalisation Could Curb Speculative Trading: Dr Ven…
7 months ago
MSN
BT TV
4:55
Saguaro: 5x Faster LLM Inference with SSD
41 views
2 weeks ago
YouTube
AI Research Roundup
Distributed Speculative Execution: A Programming Model for Reliabili…
Oct 31, 2009
Microsoft
0:36
How AI Replies So Fast! ⚡ Speculative Decoding
164 views
2 months ago
YouTube
Mr. Doubty – Short. Smart. Techy
12:46
Speculative Decoding: When Two LLMs are Faster than One
31.4K views
Oct 12, 2023
YouTube
Efficient NLP
7:40
Speculative Decoding: 3× Faster LLM Inference with Zero Quality L…
709 views
2 months ago
YouTube
Tales Of Tensors
1:08:32
A New Paradigm for Accelerating LLM Inference! The Latest Survey of Speculative Decoding
3.2K views
Mar 2, 2024
bilibili
NICE学术
7:00
Speculative Decoding with OpenVINO | Intel Software
196.9K views
8 months ago
YouTube
Intel Software
4:39
DFlash: Faster LLM Inference via Block Diffusion
30 views
1 month ago
YouTube
AI Research Roundup