Bypassing YouTube video download throttling

Have you ever tried to download videos from YouTube? I mean manually, without relying on software like youtube-dl, yt-dlp or one of “those” websites. It’s much more complicated than you might think.

YouTube generates revenue from user ad views, and it’s logical for the platform to implement restrictions to prevent people from downloading videos or even watching them on an unofficial client like YouTube Vanced. In this article, I’ll explain the technical details of these security mechanisms and how it’s possible to bypass them.

A Google search for: youtube downloader

The first step is to find the actual URL containing the video file. For this, we can communicate with the YouTube API. Specifically, the /youtubei/v1/player endpoint allows us to retrieve all the details of a video, such as the title, description, thumbnails, and most importantly, the formats. It is within these formats that we can locate the URL of the file based on the desired quality (SD, HD, UHD, etc.).
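To make the idea of picking a format by quality concrete, here is a minimal sketch. It assumes each entry of adaptiveFormats carries mimeType, height, bitrate and url fields, as seen in responses to the web client. The pickFormat helper and the sample data are hypothetical and only serve the illustration.

```javascript
// Hypothetical helper: choose the best format of a given MIME type whose
// height does not exceed maxHeight. The field names (mimeType, height,
// bitrate, url) are assumptions based on observed player responses.
function pickFormat(formats, mimePrefix, maxHeight = Infinity) {
    const candidates = formats
        .filter(f => f.mimeType.startsWith(mimePrefix))
        // Audio formats have no height; treat it as 0 so they always pass.
        .filter(f => (f.height ?? 0) <= maxHeight);

    // Highest resolution first, then highest bitrate.
    candidates.sort((a, b) =>
        ((b.height ?? 0) - (a.height ?? 0)) || (b.bitrate - a.bitrate));

    return candidates[0] ?? null;
}

// Made-up sample data, shaped like a trimmed adaptiveFormats array.
const sampleFormats = [
    { itag: 315, mimeType: 'video/webm; codecs="vp9"', height: 2160, bitrate: 26000000, url: 'https://example.invalid/315' },
    { itag: 248, mimeType: 'video/webm; codecs="vp9"', height: 1080, bitrate: 2600000, url: 'https://example.invalid/248' },
    { itag: 137, mimeType: 'video/mp4; codecs="avc1.640028"', height: 1080, bitrate: 4300000, url: 'https://example.invalid/137' },
    { itag: 251, mimeType: 'audio/webm; codecs="opus"', bitrate: 130000, url: 'https://example.invalid/251' },
];

const hdVideo = pickFormat(sampleFormats, 'video/webm', 1080);
const audio = pickFormat(sampleFormats, 'audio/webm');
console.log(hdVideo.itag, audio.itag);
```

Sorting by height before bitrate is just one reasonable policy; a real tool might prefer a codec or a file size limit instead.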

Here’s an example for the video with the ID aqz-KE-bpKQ, where we’ll get the URL for one of the formats. Note that the other variables contained within the context object are preconditions validated by the API. The accepted values were found by observing the requests sent by the web browser.

echo -n '{"videoId":"aqz-KE-bpKQ","context":{"client":{"clientName":"WEB","clientVersion":"2.20230810.05.00"}}}' |
  http POST 'https://www.youtube.com/youtubei/v1/player' |
  jq -r '.streamingData.adaptiveFormats[0].url'

https://rr1---sn-8qu-t0aee.googlevideo.com/videoplayback?expire=1691828966&ei=hu7WZOCJHI7T8wTSrr_QBg [TRUNCATED]

However, attempting to download from this URL results in a really slow download:

http --print=b --download 'https://rr1---sn-8qu-t0aee.googlevideo.com/videoplayback?expire=1691828966&ei=hu7WZOCJHI7T8wTSrr_QBg [TRUNCATED]'

Downloading to videoplayback.webm
[ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ]   0% ( 0.0/1.5 GB ) 6:23:45 66.7 kB/s

The speed is always limited to around 40-70 kB/s. Unfortunately, for this 10-minute video, it would take approximately six and a half hours to download the entire file. Clearly, the video is not downloaded at this speed when using a web browser. So what’s different?

Here’s the complete URL broken down. It’s rather complicated, but there’s one specific parameter that interests us.

Protocol: https
Hostname: rr1---sn-8qu-t0aee.googlevideo.com
Path name: /videoplayback
Query parameters:
    expire: 1691829310
    ei: 3u_WZJT7Cbag_9EPn7mi0A8
    ip: 203.0.113.30
    id: o-ABGboQn9qMKsUdClvQHd6cHm6l1dWkRw4WNj3V7wBgY1
    itag: 315
    aitags: 133,134,135,136,160,242,243,244,247,278,298,299,302,303,308,315,394,395,396,397,398,399,400,401
    source: youtube
    requiressl: yes
    mh: aP
    mm: 31,29
    mn: sn-8qu-t0aee,sn-t0a7ln7d
    ms: au,rdu
    mv: m
    mvi: 1
    pcm2cms: yes
    pl: 18
    initcwndbps: 1422500
    spc: UWF9fzkQbIbHWdKe8-ahg0uWbE_UrbUM0U6LbQfFxg
    vprv: 1
    svpuc: 1
    mime: video/webm
    ns: dn5MLRkBtM4BWwzNNOhVxHIP
    gir: yes
    clen: 1536155487
    dur: 634.566
    lmt: 1662347928284893
    mt: 1691807356
    fvip: 3
    keepalive: yes
    fexp: 24007246,24363392
    c: WEB
    txp: 553C434
    n: mAq3ayrWqdeV_7wbIgP
    sparams: expire,ei,ip,id,aitags,source,requiressl,spc,vprv,svpuc,mime,ns,gir,clen,dur,lmt
    sig: AOq0QJ8wRgIhAOx29gNeoiOLRe1GhEfE52PAiXW64ZEWX7nNdAiJE6ezAiEA0Plw6Yn0kmSFFZHO2JZPZyMGd0O-gEblUXPRrexQgrY=
    lsparams: mh,mm,mn,ms,mv,mvi,pcm2cms,pl,initcwndbps
    lsig: AG3C_xAwRQIgZVOkDl4rGPGnlK6IGCAXpzxk-cB5RRFmXDesEqOWTRoCIQCzIdPKE6C6_JQVpH6OKMF3woIJ2yVYaztT9mXIVtE6xw==

Since mid-2021, YouTube has included the query parameter n in the majority of file URLs. This parameter needs to be transformed using a JavaScript algorithm located in the file base.js, which is distributed with the web page. YouTube uses this parameter as a challenge to verify that the download originates from an “official” client. If the challenge is not solved and n is not transformed correctly, YouTube will silently apply throttling to the video download.
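As a small illustration, the n value can be read from a format URL with the standard URL class. The URL below is assembled from only a few of the parameters in the breakdown, so it is not a working download link. Treating expire as a Unix timestamp in seconds is an assumption based on its magnitude.

```javascript
// Illustrative URL built from a handful of the parameters listed in the
// breakdown; a real URL carries many more and is signed.
const formatUrl = new URL(
    'https://rr1---sn-8qu-t0aee.googlevideo.com/videoplayback' +
    '?expire=1691829310&clen=1536155487&dur=634.566&n=mAq3ayrWqdeV_7wbIgP');

// The challenge value that has to be transformed before downloading.
const n = formatUrl.searchParams.get('n');

// Assumption: expire is a Unix timestamp expressed in seconds.
const expires = new Date(Number(formatUrl.searchParams.get('expire')) * 1000);

console.log('challenge value:', n);
console.log('URL expires at:', expires.toISOString());
```

Once a transformed value is available, searchParams.set("n", ...) writes it back and toString() yields the new URL.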

The JavaScript algorithm is obfuscated and changes frequently, so it’s not practical to reverse engineer it in order to understand it. The solution is simply to download the JavaScript file, extract the algorithm code, and execute it by passing the n parameter to it. The following code accomplishes this.

import axios from 'axios';
import vm from 'vm';

const videoId = 'aqz-KE-bpKQ';

/**
 * From the YouTube API, retrieve metadata about the video (title, video format and audio format)
 */
async function retrieveMetadata(videoId) {
    const response = await axios.post('https://www.youtube.com/youtubei/v1/player', {
        "videoId": videoId,
        "context": {
            "client": { "clientName": "WEB", "clientVersion": "2.20230810.05.00" }
        }
    });

    const formats = response.data.streamingData.adaptiveFormats;

    return [
        response.data.videoDetails.title,
        formats.filter(w => w.mimeType.startsWith("video/webm"))[0],
        formats.filter(w => w.mimeType.startsWith("audio/webm"))[0],
    ];
}

/**
 * From the YouTube web page, retrieve the challenge algorithm for the n query parameter
 */
async function retrieveChallenge(video_id) {

    /**
     * Find the URL of the JavaScript file for the current player version
     */
    async function retrieve_player_url(video_id) {
        let response = await axios.get('https://www.youtube.com/embed/' + video_id);
        let player_hash = /\/s\/player\/(\w+)\/player_ias\.vflset\/\w+\/base\.js/.exec(response.data)[1];
        return `https://www.youtube.com/s/player/${player_hash}/player_ias.vflset/en_US/base.js`;
    }

    const player_url = await retrieve_player_url(video_id);

    const response = await axios.get(player_url);

    // Name of the array that is indexed where the player reads the "n" parameter
    let challenge_name = /\.get\("n"\)\)&&\(b=([a-zA-Z0-9$]+)(?:\[(\d+)\])?\([a-zA-Z0-9]\)/.exec(response.data)[1];

    // Resolve that array to the name of the function it contains
    challenge_name = new RegExp(`var ${challenge_name}\\s*=\\s*\\[(.+?)\\]\\s*[,;]`).exec(response.data)[1];

    // Capture the body of that function (group 1 is its parameter name, group 2 its body)
    const challenge = new RegExp(`${challenge_name}\\s*=\\s*function\\s*\\(([\\w$]+)\\)\\s*\\{(.+?\\}\\s*return [\\w$]+\\.join\\(""\\))\\};`, "s").exec(response.data)[2];

    return challenge;
}

/**
 * Solve the challenge and replace the n query parameter in the URL
 */
function solveChallenge(challenge, formatUrl) {
    const url = new URL(formatUrl);

    const n = url.searchParams.get("n");

    // The extracted body is wrapped in a function whose parameter is named a
    const n_transformed = vm.runInNewContext(`((a) => {${challenge}})('${n}')`);

    url.searchParams.set("n", n_transformed);
    return url.toString();
}

const [title, video, audio] = await retrieveMetadata(videoId);
const challenge = await retrieveChallenge(videoId);

video.url = solveChallenge(challenge, video.url);
audio.url = solveChallenge(challenge, audio.url);

console.log(video.url);

With this new URL containing the correctly transformed n parameter, the next step is to download the video. However, YouTube still enforces a throttling rule. This rule imposes a variable download speed limit based on the size and length of the video, aiming to provide a download time that’s approximately half the duration of the video. This aligns with the streaming nature of videos. It would be a huge waste of bandwidth for YouTube to always provide the media file as quickly as possible.

http --print=b --download 'https://rr1---sn-8qu-t0aee.googlevideo.com/videoplayback?expire=1691888702&ei=3tfXZIXSI72c_9EP1NGHqA8 [TRUNCATED]'

Downloading to videoplayback.webm
[ ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ]   4% ( 0.1/1.5 GB ) 0:06:07 4.0 MB/s
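As a rough sanity check of the “half the duration” rule, the expected rate can be computed from the clen (bytes) and dur (seconds) parameters of the example URL. The exact formula YouTube applies is not known; dividing the size by half the duration is only the approximation described above.

```javascript
// Values taken from the clen (bytes) and dur (seconds) query parameters
// of the example URL.
const clen = 1536155487;
const dur = 634.566;

// Assumption: the server targets a download time of about dur / 2.
const targetSeconds = dur / 2;
const expectedBytesPerSecond = clen / targetSeconds;

console.log(`target time: ${targetSeconds.toFixed(0)} s`);
console.log(`expected rate: ${(expectedBytesPerSecond / 1e6).toFixed(2)} MB/s`);
```

This gives roughly 4.8 MB/s, in the same range as the 4.0 MB/s observed in the download above.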

To bypass this limitation, we can break the download into several smaller parts using the HTTP Range header. This header allows you to specify which part of the file you want to download with each request (e.g. Range: bytes=2000-3000). The following code implements this logic.

// Assumes `import fs from 'fs';` next to the axios import at the top of the file.

/**
 * Download a media file by breaking it into several 10MB segments
 */
async function download(url, length, file) {
    const MEGABYTE = 1024 * 1024;

    await fs.promises.rm(file, { force: true });

    let downloadedBytes = 0;

    while (downloadedBytes < length) {
        let nextSegment = downloadedBytes + 10 * MEGABYTE;
        if (nextSegment > length) nextSegment = length;

        // Download segment (both ends of a Range are inclusive)
        const start = Date.now();
        let response = await axios.get(url, { headers: { "Range": `bytes=${downloadedBytes}-${nextSegment}` }, responseType: 'stream' });

        // Write segment by appending the response stream to the file
        await fs.promises.writeFile(file, response.data, { flag: 'a' });
        const end = Date.now();

        // Print download stats
        const progress = (nextSegment / length * 100).toFixed(2);
        const total = (length / MEGABYTE).toFixed(2);
        const speed = ((nextSegment - downloadedBytes) / (end - start) * 1000 / MEGABYTE).toFixed(2);
        console.log(`${progress}% of ${total}MB at ${speed}MB/s`);

        // The byte at nextSegment was included, so resume right after it
        downloadedBytes = nextSegment + 1;
    }
}

This works because the throttling rule takes some time to apply, and the small segments are downloaded very quickly, always using a new connection.
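The segment boundaries can also be computed up front, which makes the inclusive nature of Range ends easier to see. This is a hypothetical variation on the loop above, not the article’s code: segmentRanges is an invented helper that produces exact 10MB segments ending at the last valid byte.

```javascript
// Hypothetical helper: split a file of `size` bytes into inclusive
// [start, end] byte ranges of at most `segmentSize` bytes each.
function segmentRanges(size, segmentSize) {
    const ranges = [];
    for (let start = 0; start < size; start += segmentSize) {
        // Range ends are inclusive, so the last valid byte is size - 1.
        const end = Math.min(start + segmentSize, size) - 1;
        ranges.push({ start, end, header: `bytes=${start}-${end}` });
    }
    return ranges;
}

const MEGABYTE = 1024 * 1024;

// 1536155487 is the clen value of the example video.
const ranges = segmentRanges(1536155487, 10 * MEGABYTE);

console.log(ranges.length);
console.log(ranges[0].header);
console.log(ranges[ranges.length - 1].header);
```

Each header value could then be sent as the Range header of one request, with the responses appended to the output file in order.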

node index.js

0.68% of 1464.99MB at 46.73MB/s
1.37% of 1464.99MB at 60.98MB/s
2.05% of 1464.99MB at 71.94MB/s
2.73% of 1464.99MB at 70.42MB/s
3.41% of 1464.99MB at 68.49MB/s
4.10% of 1464.99MB at 68.97MB/s
4.78% of 1464.99MB at 74.07MB/s
5.46% of 1464.99MB at 81.97MB/s
6.14% of 1464.99MB at 104.17MB/s

We are now able to download videos much faster. During my tests, certain downloads came close to fully utilizing a 1 Gb/s connection. However, the average speeds typically ranged between 50-70 MB/s, or 400-560 Mb/s, which is still quite fast.

Post-processing

YouTube distributes the video and audio channels in two separate files. This approach helps save space, as an HD or UHD video can reuse the same audio file. Additionally, some videos now offer different audio channels based on the language. Therefore, the final step is to combine these two channels into a single file, and for that, we can simply use ffmpeg.

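// Assumes `import fs from 'fs';` and `import child_process from 'child_process';` at the top of the file.

/**
 * Using ffmpeg, combine the audio and video files into one
 */
async function combineChannels(destinationFile, videoFile, audioFile) {
    await fs.promises.rm(destinationFile, { force: true });

    // Copy both streams without re-encoding: video from the first input, audio from the second
    child_process.spawnSync('ffmpeg', [
        "-y",
        "-i", videoFile,
        "-i", audioFile,
        "-c", "copy",
        "-map", "0:v:0",
        "-map", "1:a:0",
        destinationFile
    ]);

    await fs.promises.rm(videoFile, { force: true });
    await fs.promises.rm(audioFile, { force: true });
}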

Finally, for those interested, the complete code can be downloaded here.

Conclusion

Many projects currently use these techniques to bypass the restrictions put in place by YouTube in order to prevent video downloads. The most popular one is yt-dlp (a fork of youtube-dl), programmed in Python, but it includes its own custom JavaScript interpreter to transform the n parameter.

youtube-dl (the project yt-dlp was forked from): https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/youtube.py
VLC media player: https://github.com/videolan/vlc/blob/master/share/lua/playlist/youtube.lua
NewPipe: https://github.com/Theta-Dev/NewPipeExtractor/blob/dev/extractor/src/main/java/org/schabi/newpipe/extractor/services/youtube/YoutubeJavaScriptExtractor.java
node-ytdl-core: https://github.com/fent/node-ytdl-core/blob/master/lib/sig.js

Copyright for syndicated content belongs to the linked source: Hacker News – https://blog.0x7d0.dev/history/how-they-bypass-youtube-video-download-throttling/