This post is a look at how JS goes from source code to execution, together with notes on the internals of the V8 engine.

under the hood

The following covers V8 5.8 and earlier:

Questions

  • How are events handled: how does a click in the browser UI reach the JS engine, and how is it processed there?

V8 5.8

Everything below refers to version 5.8.

Two compilers

V8 has 2 compilers, full-codegen and Crankshaft.

Full-codegen

  • Fast
  • Produces slow, unoptimized code
  • Initially, all code is compiled with full-codegen (lazily)
  • Doesn’t have an internal representation (IR); directly creates machine language (1-register stack machine) based on the abstract syntax tree (AST)

Crankshaft

  • Slow
  • Produces fast, optimized code
  • Only some functions are crankshafted (i.e., the unoptimized code generated by full-codegen is replaced with the optimized code generated by crankshaft) when V8 notices the functions are hot
    • it’s possible to crankshaft a function while it is executing (e.g., inside a loop); this is called “on stack replacement”
  • Crankshafted code relies on assumptions about the execution (e.g., which type of object gets passed to a function). If those are violated, V8 bails out (deoptimizes) and replaces the crankshafted function with unoptimized code
    • this also happens on the fly, i.e., when we’re already executing the function!
  • 2 internal representations, HIR and LIR
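
As a rough illustration of the points above, the sketch below calls a hypothetical `add` function many times with numbers, which gives an optimizing compiler a reason to specialize it, and then calls it with strings, which breaks that assumption. Whether and when V8 actually optimizes or bails out depends on the version and its heuristics; the only thing the code itself guarantees is that the results stay correct either way.

```javascript
// Hypothetical example function; the name `add` is only for illustration.
function add(a, b) {
  return a + b;
}

// Call it many times with the same argument types (numbers).
// An engine may decide the function is hot and optimize it for numbers.
let total = 0;
for (let i = 0; i < 100000; i++) {
  total = add(total, 1);
}

// Now pass strings. If optimized code assumed numbers, the engine has to
// fall back to unoptimized code, but the observable result is still correct.
const greeting = add("foo", "bar");

console.log(total, greeting);
```

Running the file with `node --trace-opt --trace-deopt` should print optimization and deoptimization events, though the output format differs between V8 versions.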

V8 uses multiple threads to process code

JS is single-threaded, but the engine is not.

  • The main thread fetches the code, compiles it, and then executes it
  • A separate compilation thread optimizes the code, so the main thread can keep executing in the meantime
  • A few threads handle garbage collection, sweeping unused memory
  • A profiler thread tells the runtime which functions take the most time, so that Crankshaft can optimize them

Optimizer

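Optimization: the optimizations change a lot from version to version, so focus on learning the mental model rather than digging into every detail.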

  • inlining
  • Hidden Class: https://getpocket.com/a/read/399108125
  • Inline Caching
  • Garbage Collection (GC) Technique
    • V8 uses a traditional mark-and-sweep garbage collector. To keep pauses short it marks incrementally: rather than walking the whole heap and marking every reachable object in one go, it walks part of the heap, lets normal execution resume, and continues marking in later steps. The sweep phase is handled by separate threads. (Reference)
  • Tagged Values
    • V8 represents both objects and integers in 32 bits and reserves one bit as a tag that tells them apart: 1 for an object (a pointer) and 0 for an integer, called a small integer (SMI), which leaves 31 bits for the value. Values that do not fit, such as doubles or larger integers, are boxed into heap objects, so code that sticks to 31-bit integers runs faster.
  • Array handling
    • V8 has two ways of storing arrays
      • Fast elements: a linear storage buffer, used for arrays with compact keys.
      • Dictionary elements: a hash table, used for sparse arrays. It is more complex and slower to access than fast elements.
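
The tagging scheme described under Tagged Values can be sketched with plain bit operations. This is only a model of the idea, assuming the 32-bit layout described above; the function names are hypothetical and are not V8 APIs, and the "address" is just a made-up even number.

```javascript
// Sketch of 32-bit value tagging: lowest bit 0 = small integer, 1 = pointer.
// All names here are hypothetical and exist only for this example.
const SMI_MIN = -(2 ** 30);
const SMI_MAX = 2 ** 30 - 1;

// A value can be stored as a small integer only if it is an integer
// that fits in 31 bits (one bit is taken by the tag).
function fitsInSmi(n) {
  return Number.isInteger(n) && n >= SMI_MIN && n <= SMI_MAX;
}

// Shift left by one so the lowest bit (the tag) is 0.
function encodeSmi(n) {
  if (!fitsInSmi(n)) throw new RangeError("value does not fit in 31 bits");
  return n << 1;
}

// Pretend `address` is an even (aligned) heap address and set the tag bit.
function encodePointer(address) {
  return address | 1;
}

function isSmi(word) {
  return (word & 1) === 0;
}

// An arithmetic shift right restores the original signed integer.
function decodeSmi(word) {
  return word >> 1;
}

console.log(decodeSmi(encodeSmi(-5)), isSmi(encodePointer(0x1000)), fitsInSmi(1.5));
```

Because the tag sits in the lowest bit, checking whether a value is an integer costs a single mask operation, and a double such as 1.5 has to be stored some other way.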

Reference

How JavaScript works

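A quick summary:

  • Source code
  • Lexical analysis (lexical analyser) produces tokens
  • Syntax analysis (parser) turns the tokens into an AST
  • The engine takes the AST and executes it while interpreting it (generating machine code). [The optimizations applied here are specific to each engine.] [Before actual execution there is a pre-parsing phase.]
    • Pre-parsing: the work done before the code in an execution context actually runs.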


code -> Token -> AST
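
To make the code -> Token -> AST steps concrete, here is a minimal sketch for expressions such as `1 + 2 * 3`. It supports only non-negative integers, `+` and `*`. The token and node shapes are assumptions made for this example and are far simpler than what a real engine produces.

```javascript
// Minimal sketch: source text -> tokens -> AST.
// Token and node shapes are assumptions made for this example.
function tokenize(source) {
  const tokens = [];
  // Match integers, the two supported operators, or any other non-space character.
  for (const [text] of source.matchAll(/\d+|[+*]|\S/g)) {
    if (/^\d+$/.test(text)) {
      tokens.push({ type: "number", value: Number(text) });
    } else if (text === "+" || text === "*") {
      tokens.push({ type: "operator", value: text });
    } else {
      throw new SyntaxError("unexpected character: " + text);
    }
  }
  return tokens;
}

function parse(tokens) {
  let position = 0;

  function parsePrimary() {
    const token = tokens[position++];
    if (!token || token.type !== "number") throw new SyntaxError("expected a number");
    return { type: "NumberLiteral", value: token.value };
  }

  // term := primary ("*" primary)*
  function parseTerm() {
    let node = parsePrimary();
    while (position < tokens.length && tokens[position].value === "*") {
      position++;
      node = { type: "BinaryExpression", operator: "*", left: node, right: parsePrimary() };
    }
    return node;
  }

  // expression := term ("+" term)*
  function parseExpression() {
    let node = parseTerm();
    while (position < tokens.length && tokens[position].value === "+") {
      position++;
      node = { type: "BinaryExpression", operator: "+", left: node, right: parseTerm() };
    }
    return node;
  }

  const tree = parseExpression();
  if (position !== tokens.length) throw new SyntaxError("unexpected token");
  return tree;
}

console.log(JSON.stringify(parse(tokenize("1 + 2 * 3")), null, 2));
```

Splitting the grammar into `parseExpression` and `parseTerm` is what makes `*` bind tighter than `+` in the resulting tree.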

  • JavaScript Engine
    • code -> AST ->
    • Runtime
      • Web APIs
      • Event Loop
    • Call Stack


The Engine consists of two main components:

  • Memory Heap — this is where the memory allocation happens
  • Call Stack — this is where your stack frames are as your code executes
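
A small sketch of how the two components show up in everyday code. The function names are hypothetical. The comment about `RangeError` describes V8's behavior; other engines may report stack exhaustion differently.

```javascript
// Each call pushes a frame on the call stack; the frame is popped on return.
// The object created inside lives on the heap and outlives the frame.
function makePoint(x, y) {
  const point = { x, y }; // allocated on the heap
  return point;           // the frame goes away, the object does not
}

// Recursion without a base case keeps pushing frames until the engine
// gives up. V8 reports this as a RangeError.
function recurseForever(depth) {
  return recurseForever(depth + 1) + 1;
}

function overflowsStack() {
  try {
    recurseForever(0);
    return false;
  } catch (error) {
    return error instanceof RangeError;
  }
}

const p = makePoint(1, 2);
console.log(p.x + p.y, overflowsStack());
```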

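Details

Back to the details: time to catch up on the MDN documentation.

What follows is mostly the statement-level and syntax-level detail you need to know when writing code.

Keywords: variable declaration, hoisting, execution context, variable object / activation object, scope chain, prototypal inheritance, this, closures

  • Variable declaration
    • Pre-parsing phase
    • We all know that variables and functions are hoisted. What does that mean exactly? Pre-parsing.
  • Parameters and variables
    • How repeated declarations are handled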
var a = 10;
var a;
console.log(a); //10

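// The `var a` on the second line is only a declaration, not an assignment.
// `a` is undefined only if no statement executed before this point has assigned to it.
// To understand this case, look at what the pre-parsing phase does.
// Distinguish declaration from assignment.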
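
The example above can be extended into a small sketch that separates declaration from assignment more explicitly. The names `hoistingDemo`, `value` and `helper` are made up for this illustration.

```javascript
function hoistingDemo() {
  const log = [];

  // `value` is already declared here (hoisted) but not yet assigned,
  // so reading it gives undefined instead of throwing a ReferenceError.
  log.push(value);

  // Function declarations are hoisted together with their bodies.
  log.push(typeof helper);

  var value = 10;
  var value; // redeclaration without assignment: has no effect
  log.push(value);

  function helper() {}
  return log;
}

console.log(hoistingDemo()); // [ undefined, 'function', 10 ]
```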

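Compiler principles

A rough understanding of the compiler material is enough; there is no need to dig into every detail.

That said, I really do need to learn a relatively low-level language now. Compiler principles matter a lot.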

A simple compiler might have a four-step process: a lexer, a parser, a translator and an interpreter.

  • The lexer, or lexical analyser (or scanner, or tokeniser) scans your source code and turns it into atomic units called tokens. This is most commonly achieved by pattern matching using regular expressions.
  • The tokenised code is then passed through a parser to identify and encode its structure and scope into what’s called a syntax tree.
  • This tree-like structure is then passed through a translator to be turned into bytecode. The simplest implementation would be a huge switch statement mapping each kind of node to its bytecode equivalent.
  • The bytecode is then passed to a bytecode interpreter, which executes it, or to a compiler that turns it into native code (machine code).
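
The last two steps can be sketched as follows: a translator that walks a syntax tree and emits instructions for a stack machine, and an interpreter that executes them. The tree is written by hand here so the block stands on its own. The node shapes and the opcode names `PUSH`, `ADD` and `MUL` are assumptions for this example, and the interpreter runs the bytecode directly rather than producing native code.

```javascript
// Hand-written syntax tree for 1 + 2 * 3; node shapes are assumptions.
const sampleAst = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "NumberLiteral", value: 1 },
  right: {
    type: "BinaryExpression",
    operator: "*",
    left: { type: "NumberLiteral", value: 2 },
    right: { type: "NumberLiteral", value: 3 },
  },
};

const OPCODES = { "+": "ADD", "*": "MUL" };

// Translator: walk the tree and emit instructions for a stack machine.
function compile(node, bytecode = []) {
  switch (node.type) {
    case "NumberLiteral":
      bytecode.push(["PUSH", node.value]);
      break;
    case "BinaryExpression":
      if (!(node.operator in OPCODES)) throw new Error("unknown operator: " + node.operator);
      compile(node.left, bytecode);
      compile(node.right, bytecode);
      bytecode.push([OPCODES[node.operator]]);
      break;
    default:
      throw new Error("unknown node type: " + node.type);
  }
  return bytecode;
}

// Interpreter: execute the instructions one by one using an operand stack.
function run(bytecode) {
  const stack = [];
  for (const [op, arg] of bytecode) {
    switch (op) {
      case "PUSH":
        stack.push(arg);
        break;
      case "ADD":
        stack.push(stack.pop() + stack.pop());
        break;
      case "MUL":
        stack.push(stack.pop() * stack.pop());
        break;
      default:
        throw new Error("unknown opcode: " + op);
    }
  }
  return stack.pop();
}

const program = compile(sampleAst);
console.log(program.map((instruction) => instruction.join(" ")).join("; "), "=>", run(program));
```

Emitting the operands before the operator (post-order traversal) is what lets the interpreter get by with a single operand stack.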

Reference


5.9

The new execution pipeline is built on top of Ignition, V8’s interpreter, and TurboFan, V8’s newest optimizing compiler.

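The post about the new engine has only 13 comments. Something this central to productivity gets far too little attention, while any application-level front-end framework draws plenty. People are too restless, which is all the more reason to move closer to the fundamentals.

For the latest V8 news, the official blog is still the place to look, but there is too much content. I will only follow the major releases and skip the rest.


A more detailed explanation

I did not check closely which version of V8 this refers to: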