JavaScript Reverse Engineering for Web Crawlers

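As front-end technology has evolved, techniques for bundling, obfuscating, and encrypting front-end code have multiplied. Companies can protect their JavaScript in the browser with measures such as variable obfuscation, control-flow obfuscation, anti-debugging, and encryption of core logic, all of which make it hard to work out what the JavaScript actually does.

The answer to these anti-crawling protections is to reverse-engineer the JavaScript: locate the encryption logic and reimplement it directly in the crawler. If the logic is too complex to reimplement, we can instead locate a few key entry points and execute the encryption logic in isolation to obtain the data. Common JavaScript reverse-engineering techniques include using browser developer tools, hooking, AST analysis, handling special obfuscation schemes, and handling WebAssembly. With these in hand, JavaScript protections become much easier to deal with.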

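Overview of Website Encryption and Obfuscation Techniques

When crawling a website, we often have to analyze its APIs or URLs, and this is where various encryption-like obstacles appear.

  • A site's URLs carry long, opaque encrypted parameters. To crawl the site we must understand how those parameters are constructed; otherwise we cannot even build a complete URL, let alone fetch anything.
  • When analyzing a site's Ajax APIs, we find that some request parameters are encrypted, and the Request Headers may carry encrypted values too. Without knowing how these values are constructed, a program cannot reproduce the Ajax requests directly.
  • Looking at the site's JavaScript source, we find a lot of minified or unintelligible text: file names are encoded, file contents are squeezed into a few lines, and variables are renamed to single characters or hexadecimal strings. All of this makes it hard to trace the encryption logic of an API through the JavaScript source.

These protections fall into two broad categories:

  • URL/API parameter encryption
  • JavaScript minification, obfuscation, and encryption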
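  1. Approaches to protecting website data

    • URL/API parameter encryption

    For example, the client and server agree on an API verification scheme: every request carries a sign parameter, built by running some combination of the current time, the request URL, the request data, the device ID, and a shared secret key through an agreed algorithm. The client implements this algorithm to compute sign and attaches it to every request. The server recomputes sign from the agreed algorithm and the request data, returns the data only if the check passes, and rejects the request otherwise.

    Login-state checks belong to the same category. For instance, an API may require a token that can only be obtained after the user logs in; a request without the token gets no data back.

    • JavaScript minification, obfuscation, and encryption

    API encryption looks like a good solution, but on its own it does not solve the problem. Why?

    A web page's logic is implemented in JavaScript, and JavaScript has the following characteristics.

    • JavaScript runs on the client, which means it must be loaded and executed in the user's browser.

    • JavaScript is open and transparent, which means anyone can read the source of the running JavaScript straight from the browser.

      For these two reasons JavaScript code is not secure: anyone can read, analyze, copy, reuse, or even tamper with it.

      To protect the data and keep it secure, we need JavaScript minification, obfuscation, and encryption.

    These techniques are summarized below:

    a. JavaScript minification

    • Minification: removes comments, whitespace, and unnecessary characters to make the code as small as possible, reducing file size and speeding up loading. Tools such as UglifyJS and Terser do this.
    • Tree shaking: removes unused code and keeps only what is referenced, shrinking the file further.

    b. JavaScript obfuscation

    • Variable renaming: renames variables, functions, and parameters so the code is harder to understand. For example, var userName = 'John'; becomes var a = 'John';
    • String obfuscation: encodes or encrypts strings so they are hard to recognize in the code, for example by converting plaintext strings to Base64.
    • Function inlining: inlines function bodies at their call sites, which removes call overhead and makes the code harder to follow.

    c. JavaScript encryption

    • Encryption algorithms: encrypt the JavaScript code with an algorithm such as AES or RSA and decrypt it at runtime before execution. This protects the logic but adds performance overhead.
    • Decryption keys: the key must be stored securely so that it cannot be extracted by an attacker.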
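  2. URL/API parameter encryption

    Today the vast majority of websites obtain their data through server-side APIs: the website or app calls an API, receives the data, and renders it. Different API implementations correspond to different levels of protection.

    To make an API more secure, the client and server agree on a verification scheme, usually built from encoding, hashing, and encryption algorithms: Base64 and Hex encoding, MD5 hashing, symmetric ciphers such as AES and DES, and asymmetric ciphers such as RSA.

    For example, suppose the client and server agree on a sign used as the API signature, generated as follows: the client computes the MD5 hash of the URL path, appends one of the URL parameters, and Base64-encodes the result to obtain the string sign, which it sends to the server in a Request URL parameter or in the Request Headers. On receiving the request, the server computes the MD5 hash of the URL path in the same way, appends the same URL parameter, Base64-encodes the result, and obtains its own sign. It then compares this sign with the one sent by the client: if they match, it returns the correct result; otherwise it rejects the request. Anyone who wants to call this API must therefore reproduce the sign generation logic, or the call will fail.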
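The signing scheme just described can be sketched in a few lines of Node.js. This is only an illustration under stated assumptions: the function names makeSign and verifySign, the path /api/movie, and the parameter value are all made up for the example, and a real site would define its own inputs and ordering.

```javascript
const crypto = require('crypto');

// Hypothetical scheme taken from the paragraph above:
//   sign = Base64( MD5(path) + value of one URL parameter )
function makeSign(path, paramValue) {
  const digest = crypto.createHash('md5').update(path).digest('hex');
  return Buffer.from(digest + paramValue, 'utf8').toString('base64');
}

// Server side: recompute the sign from the request and compare.
function verifySign(path, paramValue, sign) {
  return makeSign(path, paramValue) === sign;
}

const sign = makeSign('/api/movie', '10');
console.log(sign);
console.log(verifySign('/api/movie', '10', sign)); // true
console.log(verifySign('/api/movie', '20', sign)); // false
```

A crawler that wants to call such an API has to reproduce makeSign exactly, which is why locating this logic in the site's JavaScript is the core of the reverse-engineering work.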

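  3. JavaScript minification

    Minification removes comments, whitespace, and unnecessary characters to make JavaScript as small as possible. The output is squeezed into a few lines, so the code becomes much less readable while the site loads faster.

    If minification only strips spaces and line breaks, it offers almost no protection, because it merely reduces how readable the code is at first glance: a formatter can easily make the code readable again, whether in an IDE, an online tool, or the Chrome browser.

    Here is a very simple minification example. The original JavaScript looks like this: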

function echo(stringA, stringB) {
    const name = "Germey";
    alert("hello " + name);
}

After minification it becomes:

function echo(d,c){const e="Germey";alert("hello "+e)};

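The parameter and variable names have been shortened, the whitespace is gone, and everything sits on one line, so overall readability drops.
Minification alone offers only a little protection; for real protection we have to rely on JavaScript obfuscation and encryption.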

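  4. JavaScript obfuscation

JavaScript obfuscation rewrites the structure, formatting, and naming of source code so that it is hard to read and understand. It adds complexity, raises the cost of analyzing the code, and keeps others from easily understanding or reversing it. Common JavaScript obfuscation techniques include:

a. Variable renaming: renames variables, functions, and parameters to meaningless short names or symbols so the code is harder to follow. For example, userName is renamed to a.

b. String obfuscation: encodes or encrypts strings so that their plaintext cannot be read or recognized directly, for example by converting them to Base64 or applying a simple substitution algorithm.

c. Control-flow transformation: rewrites conditionals and loops so that the execution order looks scrambled or is hard to follow, for example through code unrolling or logic reorganization.

d. Function inlining: inlines function bodies at their call sites, which removes call overhead and adds complexity. This makes the code harder to understand, especially when there are many functions.

e. Code splitting: splits the code into several files or modules and then mixes and recombines them, so that the execution path is harder to trace.

f. Fake code insertion: inserts fragments that do nothing useful, to inflate the code and confuse the reader.

g. Dead code injection: adds code blocks or conditions that never take effect, so that it is hard to tell which parts of the code actually matter.

h. Obfuscator tools: dedicated tools such as UglifyJS (primarily a minifier), Obfuscator.io, and javascript-obfuscator automate the obfuscation.

i. Domain locking: makes the JavaScript code run only under specified domains.

j. Special encoding: encodes the JavaScript entirely into a form humans cannot read, such as emoji or other unusual notations.

All of the above are ways of implementing JavaScript obfuscation, and each protects the code to a different degree.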

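In front-end development, the mainstream tools for this today are the javascript-obfuscator and terser libraries. Both offer some obfuscation features, and both have plugins for bundlers such as webpack and Rollup. With them it is easy to obfuscate a page and emit minified, obfuscated JavaScript that is far less readable.

Take javascript-obfuscator as an example. First install Node.js 12.x or later and make sure the npm command works; for installation instructions see https://setup.scrape.center/nodejs.

Next, create a folder, for example js-obfuscate, enter it, and initialize the workspace: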

npm init

This prompts for some information and then creates package.json, which completes project initialization.

Next, install the javascript-obfuscator library:

npm i -D javascript-obfuscator

After a moment, a node_modules folder appears under the local js-obfuscate folder. It contains the javascript-obfuscator library, which means the installation succeeded.


Now we can write code for an obfuscation example. Create a file main.js with the following content:

const code = `
let x = '1' + 1
console.log('x', x)
`

const options = {
    compact:false,
    controlFlowFlattening:true
}

const obfuscator = require('javascript-obfuscator')
function obfuscate(code, options){
    return obfuscator.obfuscate(code, options).getObfuscatedCode()
}
console.log(obfuscate(code, options))

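Two variables are defined here: code, the code to be obfuscated, and options, the obfuscation options, which is an Object. We then import the javascript-obfuscator library and define a function that takes code and options and returns the obfuscated code, and finally print the obfuscated code to the console.

The logic is straightforward. Run the code: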

node main.js

The output is as follows:

function _0x3901(_0x397ea4, _0x34af54) {
    const _0x1492ed = _0x1492();
    return _0x3901 = function (_0x3901b0, _0x3bfff4) {
        _0x3901b0 = _0x3901b0 - 0xc4;
        let _0x517f84 = _0x1492ed[_0x3901b0];
        return _0x517f84;
    }, _0x3901(_0x397ea4, _0x34af54);
}
(function (_0x241ce0, _0x212e8f) {
    const _0x18ac11 = _0x3901, _0x50c203 = _0x241ce0();
    while (!![]) {
        try {
            const _0x2b0064 = -parseInt(_0x18ac11(0xc9)) / 0x1 * (parseInt(_0x18ac11(0xca)) / 0x2) + -parseInt(_0x18ac11(0xc6)) / 0x3 * (-parseInt(_0x18ac11(0xcb)) / 0x4) + -parseInt(_0x18ac11(0xc4)) / 0x5 * (parseInt(_0x18ac11(0xcd)) / 0x6) + -parseInt(_0x18ac11(0xce)) / 0x7 * (-parseInt(_0x18ac11(0xcf)) / 0x8) + -parseInt(_0x18ac11(0xc5)) / 0x9 * (parseInt(_0x18ac11(0xc7)) / 0xa) + parseInt(_0x18ac11(0xc8)) / 0xb + -parseInt(_0x18ac11(0xcc)) / 0xc;
            if (_0x2b0064 === _0x212e8f)
                break;
            else
                _0x50c203['push'](_0x50c203['shift']());
        } catch (_0x34bddd) {
            _0x50c203['push'](_0x50c203['shift']());
        }
    }
}(_0x1492, 0x26b62));
let x = '1' + 0x1;
console['log']('x', x);
function _0x1492() {
    const _0x3a35ae = [
        '15002jmoENJ',
        '588lKEAkD',
        '155052pDmOSa',
        '28602jVtPmw',
        '175jXAnkL',
        '30952fAYWHu',
        '20PjKACo',
        '19017dFTBFv',
        '3360ROyzrR',
        '500TfAfCx',
        '960773RpCnER',
        '7nRkobq'
    ];
    _0x1492 = function () {
        return _0x3a35ae;
    };
    return _0x1492();
}

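As you can see, even such simple code has been obfuscated into this. All we did here was turn on the control-flow flattening option. Overall, the code is far less readable and much harder to debug.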

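  • Code compaction

javascript-obfuscator can also compact code: the compact parameter squeezes the output onto a single line. It defaults to true; when it is set to false, the obfuscated code is printed across multiple lines.

For example: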

const code = `
let x = '1' + 1
console.log('x', x)
`

const options = {
    compact:false,
}

Here compact is set to false, and the result is:

let x = '1' + 0x1;
console['log']('x', x);

If compact is left unset or set to true, the result is:

 node main.js
const _0x14bd37=_0x49a1;(function(_0x239bc0,_0x1e0569){const _0x2eaf9f=_0x49a1,_0x50afa7=_0x239bc0();while(!![]){try{const _0x43de9a=parseInt(_0x2eaf9f(0x1aa))/0x1+parseInt(_0x2eaf9f(0x1a3))/0x2*(-parseInt(_0x2eaf9f(0x1a9))/0x3)+parseInt(_0x2eaf9f(0x1a4))/0x4*(-parseInt(_0x2eaf9f(0x1a5))/0x5)+-parseInt(_0x2eaf9f(0x1a8))/0x6+parseInt(_0x2eaf9f(0x1a7))/0x7*(parseInt(_0x2eaf9f(0x1ac))/0x8)+-parseInt(_0x2eaf9f(0x1ab))/0x9*(-parseInt(_0x2eaf9f(0x1a2))/0xa)+-parseInt(_0x2eaf9f(0x1a6))/0xb*(parseInt(_0x2eaf9f(0x1ae))/0xc);if(_0x43de9a===_0x1e0569)break;else _0x50afa7['push'](_0x50afa7['shift']());}catch(_0x57e668){_0x50afa7['push'](_0x50afa7['shift']());}}}(_0x23d7,0xabd3b));function _0x23d7(){const _0x267d9d=['44wvZcig','7FZYYIp','2009382VavdPg','57klbHPK','209656DtLYLS','2295lXMyOh','10153928QCeIxA','log','1038552qpDhiQ','46390KjUfWe','123454QvBTGy','212UuWPlk','9825WBKkTx'];_0x23d7=function(){return _0x267d9d;};return _0x23d7();}let x='1'+0x1;function _0x49a1(_0x438ab1,_0x261688){const _0x23d744=_0x23d7();return _0x49a1=function(_0x49a144,_0x41042b){_0x49a144=_0x49a144-0x1a2;let _0x39db73=_0x23d744[_0x49a144];return _0x39db73;},_0x49a1(_0x438ab1,_0x261688);}console[_0x14bd37(0x1ad)]('x',x);
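  • Identifier name obfuscation

Identifier names are obfuscated through the identifierNamesGenerator parameter of javascript-obfuscator, which controls how the new names are generated. For instance, setting it to hexadecimal replaces names with hexadecimal-style strings. The values covered here are:

  1. hexadecimal: replaces identifier names with hexadecimal-style strings such as _0xabc123.
  2. mangled: replaces identifier names with short names such as a, b, c.

The default is hexadecimal.

Let's change it to mangled and try: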

const code = `
let hello = '1' + 1
console.log('hello', hello)
`

const options = {
  compact: true,
  identifierNamesGenerator: 'mangled'
}

const obfuscator = require('javascript-obfuscator')

function obfuscate(code, options) {
  return obfuscator.obfuscate(code, options).getObfuscatedCode()
}

console.log(obfuscate(code, options))

The result is:

const i=b;function b(c,d){const e=a();return b=function(f,g){f=f-0x19c;let h=e[f];return h;},b(c,d);}(function(c,d){const h=b,e=c();while(!![]){try{const f=-parseInt(h(0x1a9))/0x1*(-parseInt(h(0x19f))/0x2)+parseInt(h(0x1a8))/0x3*(-parseInt(h(0x19d))/0x4)+-parseInt(h(0x19c))/0x5+-parseInt(h(0x1a6))/0x6*(parseInt(h(0x1a2))/0x7)+parseInt(h(0x1a5))/0x8*(parseInt(h(0x1a4))/0x9)+-parseInt(h(0x1a0))/0xa*(parseInt(h(0x1a7))/0xb)+parseInt(h(0x1a1))/0xc;if(f===d)break;else e['push'](e['shift']());}catch(g){e['push'](e['shift']());}}}(a,0x86880));function a(){const j=['62268DcCuSV','log','489546hKAdvn','830bWwpWA','21753120Qqqfbu','4049059RQKUcm','hello','1656ATTPcc','19520sPNVyO','6qzRZVj','119262bBRGeV','21HMbwOh','1dPsZVe','1840805sOUQnD'];a=function(){return j;};return a();}let hello='1'+0x1;console[i(0x19e)](i(0x1a3),hello);

As you can see, the variable names have become a, b, and so on.

If identifierNamesGenerator is set to hexadecimal or left unset, the result is:

function _0x37d7(){const _0x1ce359=['108054HKfmiQ','6618LoSOAr','390woRCwq','log','22932JgBpnU','5214528ZoEooO','199884DMktwc','hello','265208hUtRwu','90UGPBTa','1229018eMvTDk','6ukGXeH','272uFYvwl','10eWBJwn'];_0x37d7=function(){return _0x1ce359;};return _0x37d7();}const _0x3239af=_0x288b;(function(_0xadb17f,_0x287df9){const _0x2424ec=_0x288b,_0x169d6a=_0xadb17f();while(!![]){try{const _0x500ead=-parseInt(_0x2424ec(0x19c))/0x1+parseInt(_0x2424ec(0x1a8))/0x2*(-parseInt(_0x2424ec(0x19d))/0x3)+parseInt(_0x2424ec(0x1a0))/0x4*(parseInt(_0x2424ec(0x19e))/0x5)+-parseInt(_0x2424ec(0x1a7))/0x6*(-parseInt(_0x2424ec(0x1a6))/0x7)+parseInt(_0x2424ec(0x1a4))/0x8*(-parseInt(_0x2424ec(0x1a5))/0x9)+parseInt(_0x2424ec(0x1a9))/0xa*(parseInt(_0x2424ec(0x1a1))/0xb)+-parseInt(_0x2424ec(0x1a2))/0xc;if(_0x500ead===_0x287df9)break;else _0x169d6a['push'](_0x169d6a['shift']());}catch(_0x554ebe){_0x169d6a['push'](_0x169d6a['shift']());}}}(_0x37d7,0x5324f));function _0x288b(_0x51ec13,_0x1ecade){const _0x37d7cf=_0x37d7();return _0x288b=function(_0x288b05,_0x189c08){_0x288b05=_0x288b05-0x19c;let _0x200f58=_0x37d7cf[_0x288b05];return _0x200f58;},_0x288b(_0x51ec13,_0x1ecade);}let hello='1'+0x1;console[_0x3239af(0x19f)](_0x3239af(0x1a3),hello);

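As you can see, mangled produces smaller code, while hexadecimal produces less readable code. In addition, the identifiersPrefix parameter sets a prefix for the obfuscated identifiers, for example: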

const Bruce_Python_0x16e5d9=Bruce_Python_0x2de5;(function(_0x377998,_0x130a48){const _0x367270=Bruce_Python_0x2de5,_0x3a480b=_0x377998();while(!![]){try{const _0x13b168=-parseInt(_0x367270(0x126))/0x1+parseInt(_0x367270(0x129))/0x2*(parseInt(_0x367270(0x12b))/0x3)+-parseInt(_0x367270(0x12c))/0x4+parseInt(_0x367270(0x128))/0x5*(-parseInt(_0x367270(0x12d))/0x6)+-parseInt(_0x367270(0x124))/0x7*(-parseInt(_0x367270(0x12a))/0x8)+parseInt(_0x367270(0x125))/0x9+parseInt(_0x367270(0x127))/0xa*(parseInt(_0x367270(0x12e))/0xb);if(_0x13b168===_0x130a48)break;else _0x3a480b['push'](_0x3a480b['shift']());}catch(_0x406457){_0x3a480b['push'](_0x3a480b['shift']());}}}(Bruce_Python_0x4f65,0xa5772));let hello='1'+0x1;function Bruce_Python_0x2de5(_0x2ff7c3,_0x466caa){const _0x4f6530=Bruce_Python_0x4f65();return Bruce_Python_0x2de5=function(_0x2de5dc,_0x5c68da){_0x2de5dc=_0x2de5dc-0x123;let _0x165420=_0x4f6530[_0x2de5dc];return _0x165420;},Bruce_Python_0x2de5(_0x2ff7c3,_0x466caa);}function Bruce_Python_0x4f65(){const _0x130db5=['1533IGMGmo','9924147MQciRr','125954UYcDQH','320dyLAeJ','20DNpUrz','65234gnvlgS','12088jYJNMe','39ZUzLTh','1880324umpKui','919140UWcOXh','9944UgQJza','hello'];Bruce_Python_0x4f65=function(){return _0x130db5;};return Bruce_Python_0x4f65();}console['log'](Bruce_Python_0x16e5d9(0x123),hello);

As you can see, the obfuscated top-level identifiers now carry our custom prefix, Bruce_Python.
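The options that produced this output are not shown. A configuration along the following lines would be expected to give a similar result; the prefix string is inferred from the names in the output and compact is inferred from the single-line form, so both are assumptions rather than the author's actual settings.

```javascript
// Assumed configuration, reconstructed from the output shown above.
const options = {
  compact: true,                     // output above is on a single line
  identifiersPrefix: 'Bruce_Python'  // top-level names appear as Bruce_Python_0x...
}
```

This object would be passed to the same obfuscate(code, options) helper used in the earlier examples.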

In addition, the renameGlobals parameter specifies whether global variables and function names are obfuscated. It defaults to false. For example:

const code = `
var $ = function(id) {
    return document.getElementById(id);
};
`

const options = {
  renameGlobals: true
}

const obfuscator = require('javascript-obfuscator')

function obfuscate(code, options) {
  return obfuscator.obfuscate(code, options).getObfuscatedCode()
}

console.log(obfuscate(code, options))

The result is:

function _0x2cb7(){var _0x560738=['170OFwIEg','12LYKASp','5292aVIqMb','1416DlJWQM','1741341kTeKzD','49675kJHwxA','getElementById','344905vYLQcM','158340cXVPdf','48RrdgiG','824965bJsnOe','17058nnhrVL'];_0x2cb7=function(){return _0x560738;};return _0x2cb7();}function _0x4a01(_0x4dd1a0,_0x2e7701){var _0x2cb7ef=_0x2cb7();return _0x4a01=function(_0x4a0117,_0x2156e2){_0x4a0117=_0x4a0117-0x91;var _0x5b33ef=_0x2cb7ef[_0x4a0117];return _0x5b33ef;},_0x4a01(_0x4dd1a0,_0x2e7701);}(function(_0x3062c4,_0x1bc3bb){var _0x385e8a=_0x4a01,_0x3635d0=_0x3062c4();while(!![]){try{var _0x3aa207=-parseInt(_0x385e8a(0x9c))/0x1+parseInt(_0x385e8a(0x96))/0x2*(parseInt(_0x385e8a(0x94))/0x3)+-parseInt(_0x385e8a(0x93))/0x4+-parseInt(_0x385e8a(0x95))/0x5*(-parseInt(_0x385e8a(0x98))/0x6)+parseInt(_0x385e8a(0x9b))/0x7+parseInt(_0x385e8a(0x9a))/0x8*(parseInt(_0x385e8a(0x99))/0x9)+-parseInt(_0x385e8a(0x97))/0xa*(parseInt(_0x385e8a(0x92))/0xb);if(_0x3aa207===_0x1bc3bb)break;else _0x3635d0['push'](_0x3635d0['shift']());}catch(_0x21ef66){_0x3635d0['push'](_0x3635d0['shift']());}}}(_0x2cb7,0x30182));var _0x418f04=function(_0x430019){var _0x4f0e1d=_0x4a01;return document[_0x4f0e1d(0x91)](_0x430019);};

As you can see, we declared a global variable $, and with renameGlobals set to true this variable was renamed as well. If later code uses the $ object, it may fail because the definition can no longer be found, so this parameter can break the code.

If renameGlobals is left unset or set to false, the result is:

function _0x313e(_0x35e65a,_0x37f159){var _0x507669=_0x5076();return _0x313e=function(_0x313e90,_0x1e5b89){_0x313e90=_0x313e90-0x8a;var _0x1af528=_0x507669[_0x313e90];return _0x1af528;},_0x313e(_0x35e65a,_0x37f159);}(function(_0x38971b,_0x5ce6b6){var _0x13f61f=_0x313e,_0x47adcc=_0x38971b();while(!![]){try{var _0x28ae62=parseInt(_0x13f61f(0x8b))/0x1+-parseInt(_0x13f61f(0x92))/0x2+-parseInt(_0x13f61f(0x8c))/0x3*(parseInt(_0x13f61f(0x90))/0x4)+-parseInt(_0x13f61f(0x95))/0x5*(-parseInt(_0x13f61f(0x94))/0x6)+parseInt(_0x13f61f(0x8f))/0x7+parseInt(_0x13f61f(0x8d))/0x8*(parseInt(_0x13f61f(0x8e))/0x9)+parseInt(_0x13f61f(0x91))/0xa*(-parseInt(_0x13f61f(0x93))/0xb);if(_0x28ae62===_0x5ce6b6)break;else _0x47adcc['push'](_0x47adcc['shift']());}catch(_0x4fe759){_0x47adcc['push'](_0x47adcc['shift']());}}}(_0x5076,0x49b8f));function _0x5076(){var _0x182894=['711mjFXTq','998053ISmizq','476996SAPyAI','1184870WQSWpq','578726rrcplt','22swJLmk','1002VpjrKz','9355xnYNZa','getElementById','153250knZupB','6wreAkH','46432iFHoda'];_0x5076=function(){return _0x182894;};return _0x5076();}var $=function(_0x2d3ba6){var _0x29ed73=_0x313e;return document[_0x29ed73(0x8a)](_0x2d3ba6);};

As you can see, the declaration of $ is still there and the global name is unchanged.
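Why renaming globals can break a page is easy to reproduce with Node's built-in vm module. The sketch below is hand-written rather than produced by the obfuscator: the two script strings and the helper runPage are hypothetical, and the $ function is simplified so that it does not need a DOM.

```javascript
const vm = require('vm');

// Script 1 defines a global helper; script 2 (say, another file on the page) uses it.
const original = "var $ = function (id) { return 'element:' + id; };";
// Hypothetical result of obfuscating script 1 with its global renamed.
const renamed = "var _0x418f04 = function (id) { return 'element:' + id; };";
const consumer = "$('title')";

function runPage(definingScript) {
  const context = vm.createContext({});      // a fresh global scope
  vm.runInContext(definingScript, context);  // load script 1
  try {
    return vm.runInContext(consumer, context); // load script 2
  } catch (err) {
    return err.name;
  }
}

console.log(runPage(original)); // element:title
console.log(runPage(renamed));  // ReferenceError
```

The second script is untouched in both runs; it fails only because the name it depends on no longer exists in the shared global scope.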

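  • String obfuscation

String obfuscation moves string literals into an array so that they cannot be found by a direct search. It is controlled by the stringArray parameter, which defaults to true. The rotateStringArray parameter controls whether the order of the array elements is rotated, and also defaults to true. The stringArrayEncoding parameter controls how the array is encoded, and encoding is off by default: setting it to true or base64 uses Base64, and setting it to rc4 uses RC4. Finally, stringArrayThreshold sets the probability that a string literal is moved into the array; it ranges from 0 to 1 and defaults to 0.8. Option names and accepted values have changed across releases (recent ones, for instance, take an array for stringArrayEncoding), so check the README of the version you have installed. For example: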

const code = `
var a = 'hello world'
`
const options = {
  stringArray: true,
  rotateStringArray: true,
  stringArrayEncoding: true, // 'base64' or 'rc4' or false
  stringArrayThreshold: 1
}

const obfuscator = require('javascript-obfuscator')

function obfuscate(code, options) {
  return obfuscator.obfuscate(code, options).getObfuscatedCode()
}

console.log(obfuscate(code, options))

The result is:

var _0x348927=_0x186e;function _0x2437(_0x569362,_0x161a7e){var _0x29ec4d=_0x29ec();return _0x2437=function(_0x186ec2,_0x323e01){_0x186ec2=_0x186ec2-0xcb;var _0x279a37=_0x29ec4d[_0x186ec2];if(_0x2437['YrNdSE']===undefined){var _0x2bcac3=function(_0x3fc915){var _0x3fef53='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+/=';var _0x20caa2='',_0x6b64eb='';for(var _0x57de42=0x0,_0xa2eb39,_0x291c79,_0xfc26d=0x0;_0x291c79=_0x3fc915['charAt'](_0xfc26d++);~_0x291c79&&(_0xa2eb39=_0x57de42%0x4?_0xa2eb39*0x40+_0x291c79:_0x291c79,_0x57de42++%0x4)?_0x20caa2+=String['fromCharCode'](0xff&_0xa2eb39>>(-0x2*_0x57de42&0x6)):0x0){_0x291c79=_0x3fef53['indexOf'](_0x291c79);}for(var _0x1ee15d=0x0,_0x38dec2=_0x20caa2['length'];_0x1ee15d<_0x38dec2;_0x1ee15d++){_0x6b64eb+='%'+('00'+_0x20caa2['charCodeAt'](_0x1ee15d)['toString'](0x10))['slice'](-0x2);}return decodeURIComponent(_0x6b64eb);};var _0x243785=function(_0x20adaf,_0x376e17){var _0x6c264c=[],_0x3434d7=0x0,_0x370984,_0x568422='';_0x20adaf=_0x2bcac3(_0x20adaf);var _0x43cd15;for(_0x43cd15=0x0;_0x43cd15<0x100;_0x43cd15++){_0x6c264c[_0x43cd15]=_0x43cd15;}for(_0x43cd15=0x0;_0x43cd15<0x100;_0x43cd15++){_0x3434d7=(_0x3434d7+_0x6c264c[_0x43cd15]+_0x376e17['charCodeAt'](_0x43cd15%_0x376e17['length']))%0x100,_0x370984=_0x6c264c[_0x43cd15],_0x6c264c[_0x43cd15]=_0x6c264c[_0x3434d7],_0x6c264c[_0x3434d7]=_0x370984;}_0x43cd15=0x0,_0x3434d7=0x0;for(var _0x1c5b38=0x0;_0x1c5b38<_0x20adaf['length'];_0x1c5b38++){_0x43cd15=(_0x43cd15+0x1)%0x100,_0x3434d7=(_0x3434d7+_0x6c264c[_0x43cd15])%0x100,_0x370984=_0x6c264c[_0x43cd15],_0x6c264c[_0x43cd15]=_0x6c264c[_0x3434d7],_0x6c264c[_0x3434d7]=_0x370984,_0x568422+=String['fromCharCode'](_0x20adaf['charCodeAt'](_0x1c5b38)^_0x6c264c[(_0x6c264c[_0x43cd15]+_0x6c264c[_0x3434d7])%0x100]);}return _0x568422;};_0x2437['TYlecL']=_0x243785,_0x569362=arguments,_0x2437['YrNdSE']=!![];}var 
_0x3af191=_0x29ec4d[0x0],_0x1407eb=_0x186ec2+_0x3af191,_0x4a5e9a=_0x569362[_0x1407eb];return!_0x4a5e9a?(_0x2437['AFUziT']===undefined&&(_0x2437['AFUziT']=!![]),_0x279a37=_0x2437['TYlecL'](_0x279a37,_0x323e01),_0x569362[_0x1407eb]=_0x279a37):_0x279a37=_0x4a5e9a,_0x279a37;},_0x2437(_0x569362,_0x161a7e);}function _0x29ec(){var _0x2d1bbe=['oeL1t0Dpsq','ntyWntq0mgjIv2z6BG','W4hdII/cI8ouW5VdR8oFWOJcUSo5WPb8','W4CulavHWOTWWPldGCkRW501W5a','eCoGrmoeW7TnDaJcRmkbW4HLyq','WOpcMmkzWONdNCkMW6u','W4WfW49oW6XHWPtcQSkcu3X9','WQpdUCoEe8ofW4zDuctcSSo0zCkhW6O','wJpdVmkvWRpcJcpcPSkfBSoCexC','WRVcMSoqFuD9W65kfSkivX0U','kmolpftdOdtdVa','ndu4nJG0teHWAKnI','WQNcQSkasmkxWROn','owfVAgfeAa','AgvSBg8GD29YBgq','W7xcTdNdMmkZW43dL0RcRgyDqa','WOZdPmoFW7/cQmorWP5nW6RcJ8o7W5bO','emkevCkwWPVdRZq','otm5nJa1A3P4D0ft','mtKXmtC2ngDIvurxsW','W4FdJI7cJCotW5FdP8knWP3cI8oyWPjLDW'];_0x29ec=function(){return _0x2d1bbe;};return _0x29ec();}(function(_0x5d03a7,_0x32a43a){var _0x16808a=_0x186e,_0x363257=_0x2437,_0x51d111=_0x5d03a7();while(!![]){try{var _0xdf507f=parseInt(_0x363257(0xdc,'BKF('))/0x1*(-parseInt(_0x363257(0xd3,'y%JB'))/0x2)+parseInt(_0x16808a(0xd8))/0x3*(-parseInt(_0x16808a(0xd6))/0x4)+parseInt(_0x16808a(0xdd))/0x5+-parseInt(_0x16808a(0xcc))/0x6+-parseInt(_0x363257(0xce,'2AKB'))/0x7*(-parseInt(_0x16808a(0xcb))/0x8)+-parseInt(_0x363257(0xdb,'XaDc'))/0x9+parseInt(_0x363257(0xdf,'AhkU'))/0xa;if(_0xdf507f===_0x32a43a)break;else _0x51d111['push'](_0x51d111['shift']());}catch(_0x4c7127){_0x51d111['push'](_0x51d111['shift']());}}}(_0x29ec,0x771f3));function _0x186e(_0x569362,_0x161a7e){var _0x29ec4d=_0x29ec();return _0x186e=function(_0x186ec2,_0x323e01){_0x186ec2=_0x186ec2-0xcb;var _0x279a37=_0x29ec4d[_0x186ec2];if(_0x186e['cxplLu']===undefined){var _0x2bcac3=function(_0x243785){var _0x3fc915='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+/=';var _0x3fef53='',_0x20caa2='';for(var 
_0x6b64eb=0x0,_0x57de42,_0xa2eb39,_0x291c79=0x0;_0xa2eb39=_0x243785['charAt'](_0x291c79++);~_0xa2eb39&&(_0x57de42=_0x6b64eb%0x4?_0x57de42*0x40+_0xa2eb39:_0xa2eb39,_0x6b64eb++%0x4)?_0x3fef53+=String['fromCharCode'](0xff&_0x57de42>>(-0x2*_0x6b64eb&0x6)):0x0){_0xa2eb39=_0x3fc915['indexOf'](_0xa2eb39);}for(var _0xfc26d=0x0,_0x1ee15d=_0x3fef53['length'];_0xfc26d<_0x1ee15d;_0xfc26d++){_0x20caa2+='%'+('00'+_0x3fef53['charCodeAt'](_0xfc26d)['toString'](0x10))['slice'](-0x2);}return decodeURIComponent(_0x20caa2);};_0x186e['xQSoyb']=_0x2bcac3,_0x569362=arguments,_0x186e['cxplLu']=!![];}var _0x3af191=_0x29ec4d[0x0],_0x1407eb=_0x186ec2+_0x3af191,_0x4a5e9a=_0x569362[_0x1407eb];return!_0x4a5e9a?(_0x279a37=_0x186e['xQSoyb'](_0x279a37),_0x569362[_0x1407eb]=_0x279a37):_0x279a37=_0x4a5e9a,_0x279a37;},_0x186e(_0x569362,_0x161a7e);}var a=_0x348927(0xd9);

As you can see, the string has been Base64-encoded, and we can no longer locate it by searching. If stringArray is set to false, the output looks like this:

var a='hello\x20world';

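The string is still shown in plaintext and has not been encoded. We can also use the unicodeEscapeSequence parameter to convert strings into escape sequences, making them even harder to recognize. For example: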

const code = `
var a = 'hello world'
`
const options = {
  compact: false,
  unicodeEscapeSequence: true
}

const obfuscator = require('javascript-obfuscator')

function obfuscate(code, options) {
  return obfuscator.obfuscate(code, options).getObfuscatedCode()
}

console.log(obfuscate(code, options))

The result is:

(function (_0x1b6a0d, _0x1c97c8) {
    var _0xd2d623 = _0x52e8, _0x217f4a = _0x1b6a0d();
    while (!![]) {
        try {
            var _0x2408ea = -parseInt(_0xd2d623(0xdf)) / 0x1 + parseInt(_0xd2d623(0xde)) / 0x2 + -parseInt(_0xd2d623(0xe1)) / 0x3 + -parseInt(_0xd2d623(0xe4)) / 0x4 + parseInt(_0xd2d623(0xe0)) / 0x5 + -parseInt(_0xd2d623(0xe3)) / 0x6 * (-parseInt(_0xd2d623(0xe2)) / 0x7) + -parseInt(_0xd2d623(0xdd)) / 0x8;
            if (_0x2408ea === _0x1c97c8)
                break;
            else
                _0x217f4a['push'](_0x217f4a['shift']());
        } catch (_0x7aa649) {
            _0x217f4a['push'](_0x217f4a['shift']());
        }
    }
}(_0x1aea, 0xf1bf2));
function _0x52e8(_0x4d9345, _0xe5f9cb) {
    var _0x1aeadb = _0x1aea();
    return _0x52e8 = function (_0x52e84a, _0x45526f) {
        _0x52e84a = _0x52e84a - 0xdd;
        var _0x253278 = _0x1aeadb[_0x52e84a];
        return _0x253278;
    }, _0x52e8(_0x4d9345, _0xe5f9cb);
}
function _0x1aea() {
    var _0x2c665d = [
        '\x31\x32\x39\x39\x32\x31\x34\x37\x69\x4f\x55\x69\x41\x61',
        '\x36\x7a\x57\x4b\x61\x6e\x77',
        '\x34\x32\x36\x39\x39\x34\x34\x75\x79\x75\x79\x69\x79',
        '\x33\x32\x39\x35\x33\x33\x36\x4e\x77\x57\x53\x4a\x42',
        '\x33\x39\x31\x39\x37\x30\x42\x63\x71\x79\x49\x53',
        '\x38\x39\x38\x33\x36\x31\x46\x6c\x50\x68\x44\x57',
        '\x39\x31\x39\x33\x33\x33\x35\x4d\x51\x77\x65\x43\x54',
        '\x31\x35\x36\x38\x31\x34\x35\x65\x70\x74\x6a\x77\x48'
    ];
    _0x1aea = function () {
        return _0x2c665d;
    };
    return _0x1aea();
}
var a = '\x68\x65\x6c\x6c\x6f\x20\x77\x6f\x72\x6c\x64';

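As you can see, the strings have been turned into hexadecimal escape sequences and are very hard to recognize.

In JavaScript reverse engineering, key strings often serve as the starting point for locating the encryption entry point. With this kind of obfuscation, anyone who tries a global search for a string such as hello to find that entry point will come up empty.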
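Both ideas, the escaped literal and the encoded string array, can be imitated by hand. The sketch below is a simplified illustration, not the code javascript-obfuscator generates: toHexEscapes, buildStringArray, and lookup are hypothetical names, and Buffer is a Node.js API (browser code would typically use atob instead).

```javascript
// Build step: what an obfuscator would do ahead of time.
function toHexEscapes(text) {
  // Render every character as a \xNN escape; assumes code points below 0x100.
  return Array.from(text, ch =>
    '\\x' + ch.charCodeAt(0).toString(16).padStart(2, '0')).join('');
}

function buildStringArray(strings) {
  // Store each string Base64-encoded so the plaintext never appears in the output.
  return strings.map(s => Buffer.from(s, 'utf8').toString('base64'));
}

// Runtime step: only the encoded table and a lookup function are shipped.
const table = buildStringArray(['hello world', 'log']);
function lookup(index) {
  return Buffer.from(table[index], 'base64').toString('utf8');
}

console.log(toHexEscapes('hello world'));
console.log(table);
console[lookup(1)](lookup(0)); // same effect as console.log('hello world')
```

Searching the shipped table for hello finds nothing, yet the program still prints hello world at runtime, which is exactly why a reverser has to run or emulate the lookup function to recover the strings.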

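  • Self-defending code

The selfDefending parameter turns on code self-protection. When it is enabled, the obfuscated JavaScript is forced onto a single line, and if the obfuscated code is then formatted or its identifiers are renamed, it will no longer run. For example: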

const code = `
console.log('hello world')
`

const options = {
  selfDefending: true
}

const obfuscator = require('javascript-obfuscator')

function obfuscate(code, options) {
  return obfuscator.obfuscate(code, options).getObfuscatedCode()
}

console.log(obfuscate(code, options))

The result is:

var _0x1b56f8=_0x3c50;function _0x44c6(){var _0x57fffd=['90xSYcHj','1466352DFEbcP','2556301FzbUEv','385059xBOYtr','255308fIGnns','256488kiENBq','10JJXZMY','10zxBYXG','hello\x20world','log','(((.+)+)+)+$','toString','48TIjOQD','34aBNcqb','constructor','search','313586SXQUdp','74373Weqtoq'];_0x44c6=function(){return _0x57fffd;};return _0x44c6();}function _0x3c50(_0x1b980a,_0x26ed0b){var _0x27acc5=_0x44c6();return _0x3c50=function(_0x279fc4,_0x55725f){_0x279fc4=_0x279fc4-0x138;var _0x44c649=_0x27acc5[_0x279fc4];return _0x44c649;},_0x3c50(_0x1b980a,_0x26ed0b);}(function(_0x3f4390,_0x55f028){var _0x54030a=_0x3c50,_0xe6d746=_0x3f4390();while(!![]){try{var _0x9b2181=-parseInt(_0x54030a(0x13b))/0x1+parseInt(_0x54030a(0x145))/0x2*(parseInt(_0x54030a(0x149))/0x3)+parseInt(_0x54030a(0x13c))/0x4*(-parseInt(_0x54030a(0x13e))/0x5)+parseInt(_0x54030a(0x144))/0x6*(-parseInt(_0x54030a(0x148))/0x7)+parseInt(_0x54030a(0x13d))/0x8*(parseInt(_0x54030a(0x138))/0x9)+parseInt(_0x54030a(0x13f))/0xa*(parseInt(_0x54030a(0x13a))/0xb)+parseInt(_0x54030a(0x139))/0xc;if(_0x9b2181===_0x55f028)break;else _0xe6d746['push'](_0xe6d746['shift']());}catch(_0x6dab8c){_0xe6d746['push'](_0xe6d746['shift']());}}}(_0x44c6,0x3710b));var _0x55725f=(function(){var _0x3b535c=!![];return function(_0x2040d0,_0x1e024e){var _0xa27340=_0x3b535c?function(){if(_0x1e024e){var _0xe5a99f=_0x1e024e['apply'](_0x2040d0,arguments);return _0x1e024e=null,_0xe5a99f;}}:function(){};return _0x3b535c=![],_0xa27340;};}()),_0x279fc4=_0x55725f(this,function(){var _0x201807=_0x3c50;return _0x279fc4[_0x201807(0x143)]()[_0x201807(0x147)](_0x201807(0x142))[_0x201807(0x143)]()[_0x201807(0x146)](_0x279fc4)[_0x201807(0x147)](_0x201807(0x142));});_0x279fc4(),console[_0x1b56f8(0x141)](_0x1b56f8(0x140));

If we paste the code above into the console unchanged, it runs exactly as before, with no problems.
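The generated code above contains the strings toString and search together with a regular expression, which suggests that it inspects its own source text. The following is a much simplified, hand-written sketch of that idea, not the obfuscator's actual check: the function guarded and its newline test are assumptions made for illustration.

```javascript
// Hypothetical self-check: the function refuses to run if its own source contains a line break.
function guarded() { const src = guarded.toString(); if (/\n/.test(src)) { throw new Error('code was reformatted'); } return 'hello world'; }

console.log(guarded()); // hello world (the function is still on one line)

// Simulate a beautifier by inserting one line break after the first brace,
// then re-create the function from the modified source.
const beautified = guarded.toString().replace('{', '{\n');
const reformatted = new Function(beautified + '\nreturn guarded;')();

try {
  reformatted();
} catch (err) {
  console.log(err.message); // code was reformatted
}
```

The behaviour of the function is unchanged by the formatting, yet the copy stops working, which is the effect described above.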


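  • Control-flow flattening

Control-flow flattening obfuscates the execution logic of the code to make it complex and hard to read. The basic idea is to put a common dispatcher block in front of the logic blocks: each block is reached only through the dispatcher's condition check, and control returns to the dispatcher afterwards. The resulting loops make the overall flow convoluted and hard to follow.

Here is a sample: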

console.log(c);
console.log(a);
console.log(b);

The logic is obvious at a glance: it prints the values of c, a, and b in turn. After control-flow flattening, however, the code becomes:

const s = "3|1|2".split("|");
let x = 0;
while (true) {
    switch (s[x++]) {
        case "1":
            console.log(a);
            continue;
        case "2":
            console.log(b);
            continue;
        case "3":
            console.log(c);
            continue;
    }
    break;
}

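As you can see, the obfuscated code first declares a variable s, which holds a list, namely ["3", "1", "2"]. The switch statement then dispatches on each element of s, and every case carries its own piece of logic. This breaks up what used to be sequential logic: originally we could see at a glance that the console prints c first and then a and b, but now we have to combine the switch condition with the contents of each case to work that out. The execution order is no longer obvious, which greatly reduces readability.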
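To check that the flattened form really behaves like the original, the two versions can be run side by side. This is a self-contained variant of the sample above: the function names straight and flattened are made up, and the values are collected in an array instead of being printed so that the order can be compared.

```javascript
// Original, sequential logic: c, then a, then b.
function straight(a, b, c) {
  const out = [];
  out.push(c);
  out.push(a);
  out.push(b);
  return out;
}

// Flattened logic: a dispatcher loop walks the order string and jumps to each case.
function flattened(a, b, c) {
  const out = [];
  const order = '3|1|2'.split('|');
  let step = 0;
  while (true) {
    switch (order[step++]) {
      case '1': out.push(a); continue;
      case '2': out.push(b); continue;
      case '3': out.push(c); continue;
    }
    break; // reached only once the order list is exhausted
  }
  return out;
}

console.log(straight('a', 'b', 'c'));  // [ 'c', 'a', 'b' ]
console.log(flattened('a', 'b', 'c')); // [ 'c', 'a', 'b' ]
```

When reversing flattened code, reading the order string first and then visiting the cases in that order recovers the original sequence.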

In javascript-obfuscator, the controlFlowFlattening parameter controls whether control-flow flattening is enabled, for example:

const options = {
    compact: false,
    controlFlowFlattening: true
}

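Control-flow flattening makes the execution logic more complex and harder to read, and a great many front-end obfuscation setups enable it today. Once it is enabled, however, the code runs more slowly, by up to 1.5 times.

The controlFlowFlatteningThreshold parameter controls the proportion of code that is flattened. It ranges from 0 to 1 and defaults to 0.75. Setting it to 0 is equivalent to setting controlFlowFlattening to false, that is, control-flow flattening is turned off.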

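  • Dead code injection

Dead code is code that is never executed or that has no effect on its surroundings. Injecting it gets in the way of anyone reading the existing JavaScript. The deadCodeInjection parameter enables this option; it defaults to false.

Sample code: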

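const code = `
const a = function() {
  console.log('hello world')
}

const b = function() {
  console.log('nice to meet you')
}

a()
b()
`

const options = {
  compact: false,
  deadCodeInjection: true
}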

const obfuscator = require('javascript-obfuscator')

function obfuscate(code, options) {
  return obfuscator.obfuscate(code, options).getObfuscatedCode()
}

console.log(obfuscate(code, options))

This declares two functions, a and b, and calls them in turn to print two sentences. After dead code injection, the code becomes:

(function (_0x5e48ef, _0x428808) {
    const _0x3f9afb = _0x5151, _0x16cb33 = _0x5e48ef();
    while (!![]) {
        try {
            const _0x4621c2 = -parseInt(_0x3f9afb(0x1e9)) / 0x1 * (-parseInt(_0x3f9afb(0x1e5)) / 0x2) + parseInt(_0x3f9afb(0x1e1)) / 0x3 + parseInt(_0x3f9afb(0x1ea)) / 0x4 + -parseInt(_0x3f9afb(0x1e8)) / 0x5 * (-parseInt(_0x3f9afb(0x1e3)) / 0x6) + -parseInt(_0x3f9afb(0x1e6)) / 0x7 + -parseInt(_0x3f9afb(0x1e4)) / 0x8 + -parseInt(_0x3f9afb(0x1e7)) / 0x9;
            if (_0x4621c2 === _0x428808)
                break;
            else
                _0x16cb33['push'](_0x16cb33['shift']());
        } catch (_0xe49682) {
            _0x16cb33['push'](_0x16cb33['shift']());
        }
    }
}(_0xf9fd, 0x52ef1));
const a = function () {
        const _0x445e1a = _0x5151;
        console[_0x445e1a(0x1e2)]('hello\x20world');
    }, b = function () {
        console['log']('nice\x20to\x20meet\x20you');
    };
function _0x5151(_0x1bea8c, _0x4611c5) {
    const _0xf9fdc7 = _0xf9fd();
    return _0x5151 = function (_0x515190, _0x28e2ae) {
        _0x515190 = _0x515190 - 0x1e1;
        let _0x459069 = _0xf9fdc7[_0x515190];
        return _0x459069;
    }, _0x5151(_0x1bea8c, _0x4611c5);
}
function _0xf9fd() {
    const _0x145385 = [
        '102rjAcjB',
        '4037216eCpEFT',
        '300xBNgHs',
        '2112922TJhNbR',
        '1031958MqkGJZ',
        '93895ZomTGw',
        '1300DSmZqn',
        '2621160RhOGdz',
        '273972HsWiAN',
        'log'
    ];
    _0xf9fd = function () {
        return _0x145385;
    };
    return _0xf9fd();
}
a(), b();

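When dead code is injected, a function body gains extra if…else branches whose condition is itself an expression, so we cannot tell at a glance whether it evaluates to true or false. Note that the short output shown above happens not to contain such a branch. Either way, the obfuscated code produces exactly the same result as the original.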
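What an injected branch looks like can be shown by hand. The sketch below is an illustration only: the function names, the string 'JlPkq', and the dead branch are invented for the example, not taken from the obfuscator's output.

```javascript
function greet() {
  return 'hello world';
}

function greetWithDeadCode() {
  // Opaque predicate: always true, but not obviously so at a glance.
  if ('JlPkq'.charAt(2) === 'P') {
    return 'hello world';
  } else {
    // Dead branch: never executed, present only to mislead the reader.
    return 'nice' + Math.random().toString(36);
  }
}

console.log(greet() === greetWithDeadCode()); // true
```

A reverser who evaluates the predicate once can delete the dead branch, which is one of the simplifications AST-based deobfuscation tools automate.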


  • Object key transformation

For objects, the transformObjectKeys parameter transforms the object's keys, for example:

(function (_0x4a3c9e, _0x11470d) {
    var _0x43ce74 = _0x3495, _0x5eed25 = _0x4a3c9e();
    while (!![]) {
        try {
            var _0xa49492 = parseInt(_0x43ce74(0x186)) / 0x1 * (-parseInt(_0x43ce74(0x17f)) / 0x2) + -parseInt(_0x43ce74(0x18a)) / 0x3 * (parseInt(_0x43ce74(0x185)) / 0x4) + -parseInt(_0x43ce74(0x184)) / 0x5 + -parseInt(_0x43ce74(0x188)) / 0x6 + -parseInt(_0x43ce74(0x180)) / 0x7 * (-parseInt(_0x43ce74(0x187)) / 0x8) + -parseInt(_0x43ce74(0x18b)) / 0x9 + parseInt(_0x43ce74(0x18e)) / 0xa * (parseInt(_0x43ce74(0x18d)) / 0xb);
            if (_0xa49492 === _0x11470d)
                break;
            else
                _0x5eed25['push'](_0x5eed25['shift']());
        } catch (_0x5e25b4) {
            _0x5eed25['push'](_0x5eed25['shift']());
        }
    }
}(_0x5a77, 0x5b892), (function () {
    var _0x15c415 = _0x3495, _0x2b820c = {};
    _0x2b820c[_0x15c415(0x189)] = _0x15c415(0x182);
    var _0x20ee51 = {};
    _0x20ee51[_0x15c415(0x18c)] = _0x15c415(0x183), _0x20ee51[_0x15c415(0x181)] = _0x2b820c;
    var _0x1b69b9 = _0x20ee51;
}()));
function _0x3495(_0x519889, _0x407957) {
    var _0x5a7770 = _0x5a77();
    return _0x3495 = function (_0x3495e6, _0xa2fa2c) {
        _0x3495e6 = _0x3495e6 - 0x17f;
        var _0x4ce931 = _0x5a7770[_0x3495e6];
        return _0x4ce931;
    }, _0x3495(_0x519889, _0x407957);
}
function _0x5a77() {
    var _0x34ee00 = [
        '29168FqvoWB',
        '4274538FlVWrr',
        'baz',
        '102777xzCjUf',
        '6400350VUCeVe',
        'foo',
        '11UVKDWv',
        '19432430SqQONT',
        '2eQHxls',
        '1211yZLccH',
        'bar',
        'test2',
        'test1',
        '2080500Mcwxoy',
        '4CYgQwf',
        '325139uebZtP'
    ];
    _0x5a77 = function () {
        return _0x34ee00;
    };
    return _0x5a77();
}

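As you can see, the object's keys have been pulled out of the object literal and are assigned through lookups into the string array, which makes the code harder to read.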
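The input for this example is not shown. Judging from the strings foo, bar, baz, test1, and test2 in the output, it was presumably a small nested object, and the sketch below assumes so. It is a hand-written comparison of the literal form and the key-by-key form; the names buildPlain, buildTransformed, strings, and k are hypothetical.

```javascript
// Presumed original form (reconstructed from the strings in the output above).
function buildPlain() {
  return { foo: 'test1', bar: { baz: 'test2' } };
}

// Hand-written equivalent of the transformed form: keys are assigned one by one
// through a lookup function instead of appearing in an object literal.
const strings = ['baz', 'test2', 'foo', 'test1', 'bar'];
const k = i => strings[i];

function buildTransformed() {
  const inner = {};
  inner[k(0)] = k(1);
  const outer = {};
  outer[k(2)] = k(3), outer[k(4)] = inner;
  return outer;
}

console.log(JSON.stringify(buildPlain()) === JSON.stringify(buildTransformed())); // true
```

The resulting object is identical; only the way it is written down has changed, so resolving the lookups restores the readable literal.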

  • Disabling console output

The disableConsoleOutput parameter disables console.log output, which makes debugging harder. For example:

const code = `
console.log('hello world')
`
const options = {
  disableConsoleOutput: true
}

const obfuscator = require('javascript-obfuscator')

function obfuscate(code, options) {
  return obfuscator.obfuscate(code, options).getObfuscatedCode()
}

console.log(obfuscate(code, options))

The result is:

var _0x2102a4=_0x2ee5;(function(_0x2523e4,_0x50adc4){var _0x36a6f5=_0x2ee5,_0x122ebb=_0x2523e4();while(!![]){try{var _0x5502d9=parseInt(_0x36a6f5(0x159))/0x1*(parseInt(_0x36a6f5(0x15b))/0x2)+parseInt(_0x36a6f5(0x168))/0x3*(parseInt(_0x36a6f5(0x167))/0x4)+-parseInt(_0x36a6f5(0x16a))/0x5+-parseInt(_0x36a6f5(0x15d))/0x6*(-parseInt(_0x36a6f5(0x162))/0x7)+parseInt(_0x36a6f5(0x166))/0x8+parseInt(_0x36a6f5(0x164))/0x9+parseInt(_0x36a6f5(0x16b))/0xa*(-parseInt(_0x36a6f5(0x16c))/0xb);if(_0x5502d9===_0x50adc4)break;else _0x122ebb['push'](_0x122ebb['shift']());}catch(_0x85f884){_0x122ebb['push'](_0x122ebb['shift']());}}}(_0x5a6e,0x8c312));var _0x3537a0=(function(){var _0x4778b4=!![];return function(_0x1b7965,_0x3b16f8){var _0x38da34=_0x4778b4?function(){var _0x4e48ef=_0x2ee5;if(_0x3b16f8){var _0x164ad8=_0x3b16f8[_0x4e48ef(0x161)](_0x1b7965,arguments);return _0x3b16f8=null,_0x164ad8;}}:function(){};return _0x4778b4=![],_0x38da34;};}()),_0x1ec86f=_0x3537a0(this,function(){var _0x6679e8=_0x2ee5,_0x87c2e0=function(){var _0x3dec91=_0x2ee5,_0x5ab99c;try{_0x5ab99c=Function(_0x3dec91(0x15f)+_0x3dec91(0x16d)+');')();}catch(_0x240b7f){_0x5ab99c=window;}return _0x5ab99c;},_0x4f4cd5=_0x87c2e0(),_0x2400b0=_0x4f4cd5[_0x6679e8(0x169)]=_0x4f4cd5[_0x6679e8(0x169)]||{},_0x37f40a=[_0x6679e8(0x165),_0x6679e8(0x160),_0x6679e8(0x15e),_0x6679e8(0x163),'exception',_0x6679e8(0x15c),_0x6679e8(0x158)];for(var _0x56a6e1=0x0;_0x56a6e1<_0x37f40a[_0x6679e8(0x170)];_0x56a6e1++){var _0x5edc8b=_0x3537a0['constructor'][_0x6679e8(0x15a)][_0x6679e8(0x16e)](_0x3537a0),_0xfe5de9=_0x37f40a[_0x56a6e1],_0x4505a3=_0x2400b0[_0xfe5de9]||_0x5edc8b;_0x5edc8b[_0x6679e8(0x16f)]=_0x3537a0['bind'](_0x3537a0),_0x5edc8b[_0x6679e8(0x157)]=_0x4505a3[_0x6679e8(0x157)][_0x6679e8(0x16e)](_0x4505a3),_0x2400b0[_0xfe5de9]=_0x5edc8b;}});function _0x2ee5(_0x475530,_0xaf728c){var _0x2aa2b1=_0x5a6e();return _0x2ee5=function(_0x1ec86f,_0x3537a0){_0x1ec86f=_0x1ec86f-0x157;var _0x3160ba=_0x2aa2b1[_0x1ec86f];return 
_0x3160ba;},_0x2ee5(_0x475530,_0xaf728c);}_0x1ec86f(),console[_0x2102a4(0x165)]('hello\x20world');function _0x5a6e(){var _0x2bd5f5=['console','340725zAoDtK','2340mHCxzs','170049QFdlAw','{}.constructor(\x22return\x20this\x22)(\x20)','bind','__proto__','length','toString','trace','827903DGttIy','prototype','2ZKfJGV','table','6EoTEXw','info','return\x20(function()\x20','warn','apply','7548023dMilus','error','3236121MZEjtb','log','6974296tynSuQ','4488916UjJrGf','3bPnCdf'];_0x5a6e=function(){return _0x2bd5f5;};return _0x5a6e();}

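If we run this code, nothing is printed. The injected wrapper has replaced several console methods (the string table lists log, warn, info, error, exception, table, and trace) with stand-ins that produce no output, so console output is effectively disabled.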
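The effect can be sketched without the obfuscator. The following is a minimal illustration, not the code javascript-obfuscator actually injects: `disableConsoleOutput` and `restoreConsoleOutput` are hypothetical helper names, and the method list is taken from the strings visible in the output above. It operates on a stand-in object so the real console is left alone.

```javascript
// Method names taken from the string table in the obfuscated output above.
const SILENCED_METHODS = ['log', 'warn', 'info', 'error', 'exception', 'table', 'trace'];

// Hypothetical helper: replaces each listed method on `target` with a no-op
// and returns the originals so they can be put back later.
function disableConsoleOutput(target) {
  const originals = {};
  for (const name of SILENCED_METHODS) {
    originals[name] = target[name];
    target[name] = function () {};
  }
  return originals;
}

// Hypothetical helper: undoes disableConsoleOutput.
function restoreConsoleOutput(target, originals) {
  for (const name of SILENCED_METHODS) {
    if (originals[name] === undefined) {
      delete target[name]; // the method did not exist before
    } else {
      target[name] = originals[name];
    }
  }
}

// Demo on a stand-in object instead of the real console.
const recorded = [];
const fakeConsole = { log: (msg) => recorded.push(msg) };
const saved = disableConsoleOutput(fakeConsole);
fakeConsole.log('swallowed');          // no-op, nothing is recorded
restoreConsoleOutput(fakeConsole, saved);
fakeConsole.log('visible again');      // recorded as usual
```

When reversing such code, keeping a reference to the original methods before the protected script runs (or restoring them afterwards, as above) is usually enough to get output back.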

  • Debug protection

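When JavaScript execution reaches a debugger statement while developer tools are open, it pauses in breakpoint mode. If debugger statements are placed at many positions in the code, or some logic is defined to execute debugger over and over, execution keeps dropping into breakpoint mode and the original code can no longer run smoothly. This is called debug protection: repeatedly triggering debugger so that the code cannot be stepped through comfortably.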

The effect is similar to running the following code:

setInterval(() => {debugger;}, 3000)

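If we paste this code into the console, it executes the debugger statement every three seconds and enters breakpoint mode each time, which interferes with normal debugging.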
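Protected code rarely writes the keyword literally. In the obfuscated output further down, the string table contains the fragments 'debu' and 'gger', which are joined at run time and passed to a function constructor. A minimal sketch of that idea follows; `makeDebuggerTrap` and `runTrap` are hypothetical names, and this is a simplification rather than the obfuscator's actual code.

```javascript
// Hypothetical helper: builds a function whose body is a `debugger` statement.
// The keyword is assembled from fragments so a plain-text search misses it.
function makeDebuggerTrap() {
  const keyword = 'debu' + 'gger';
  return new Function(keyword);
}

// Hypothetical helper: fires the trap `times` times and reports how often it ran.
// With no debugger attached, each call is a no-op and returns immediately.
function runTrap(times) {
  const trap = makeDebuggerTrap();
  let fired = 0;
  for (let i = 0; i < times; i++) {
    trap();
    fired++;
  }
  return fired;
}
```

Because the statement only pauses when a debugger is attached, ordinary visitors see no difference, while anyone with DevTools open is interrupted on every call.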

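In javascript-obfuscator, the debugProtection option enables this mechanism, and debugProtectionInterval additionally re-triggers it on a timer, so that the code keeps entering breakpoint mode during debugging and cannot run smoothly. With debug protection enabled, the obfuscated output of a short loop looks like this: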

const _0x368f32=_0x3817;(function(_0x3427f4,_0xaa7d82){const _0x5c7233=_0x3817,_0x282a31=_0x3427f4();while(!![]){try{const _0x5cef06=-parseInt(_0x5c7233(0x8e))/0x1+parseInt(_0x5c7233(0x86))/0x2*(-parseInt(_0x5c7233(0x82))/0x3)+-parseInt(_0x5c7233(0x79))/0x4+-parseInt(_0x5c7233(0x80))/0x5*(-parseInt(_0x5c7233(0x77))/0x6)+parseInt(_0x5c7233(0x89))/0x7+parseInt(_0x5c7233(0x8a))/0x8+-parseInt(_0x5c7233(0x81))/0x9*(-parseInt(_0x5c7233(0x85))/0xa);if(_0x5cef06===_0xaa7d82)break;else _0x282a31['push'](_0x282a31['shift']());}catch(_0x404c9f){_0x282a31['push'](_0x282a31['shift']());}}}(_0x934f,0x36b4b));const _0x3b94e2=(function(){let _0x12f64e=!![];return function(_0x57d7a0,_0x483a5e){const _0x4e8a30=_0x12f64e?function(){const _0x32b0c4=_0x3817;if(_0x483a5e){const _0x8198dc=_0x483a5e[_0x32b0c4(0x88)](_0x57d7a0,arguments);return _0x483a5e=null,_0x8198dc;}}:function(){};return _0x12f64e=![],_0x4e8a30;};}());function _0x3817(_0x258a52,_0x36b674){const _0x1ab7cf=_0x934f();return _0x3817=function(_0x1d4554,_0x3b94e2){_0x1d4554=_0x1d4554-0x77;let _0x934ff5=_0x1ab7cf[_0x1d4554];return _0x934ff5;},_0x3817(_0x258a52,_0x36b674);}(function(){_0x3b94e2(this,function(){const _0x2daf5e=_0x3817,_0x9d16e7=new RegExp(_0x2daf5e(0x8b)),_0x4c39ae=new RegExp(_0x2daf5e(0x78),'i'),_0x29fce1=_0x1d4554(_0x2daf5e(0x8c));!_0x9d16e7['test'](_0x29fce1+_0x2daf5e(0x7f))||!_0x4c39ae['test'](_0x29fce1+_0x2daf5e(0x84))?_0x29fce1('0'):_0x1d4554();})();}());function _0x934f(){const _0x21927d=['1208421IXtqKl','action','input','195110ErSnZQ','2EdkfEw','debu','apply','2940679MgffpO','1336888tGNuYo','function\x20*\x5c(\x20*\x5c)','init','call','106462FFldAH','log','6gKHOyW','\x5c+\x5c+\x20*(?:[a-zA-Z_$][0-9a-zA-Z_$]*)','1420616mjWpZp','stateObject','while\x20(true)\x20{}','gger','constructor','counter','chain','1823565GBzQuK','63PEYKVn'];_0x934f=function(){return _0x21927d;};return _0x934f();}for(let i=0x0;i<0x5;i++){console[_0x368f32(0x8f)]('i',i);}function _0x1d4554(_0xad2192){function 
_0x4a8589(_0x26058b){const _0x193dae=_0x3817;if(typeof _0x26058b==='string')return function(_0x376fd7){}['constructor'](_0x193dae(0x7b))[_0x193dae(0x88)](_0x193dae(0x7e));else(''+_0x26058b/_0x26058b)['length']!==0x1||_0x26058b%0x14===0x0?function(){return!![];}[_0x193dae(0x7d)](_0x193dae(0x87)+_0x193dae(0x7c))[_0x193dae(0x8d)](_0x193dae(0x83)):function(){return![];}[_0x193dae(0x7d)]('debu'+_0x193dae(0x7c))[_0x193dae(0x88)](_0x193dae(0x7a));_0x4a8589(++_0x26058b);}try{if(_0xad2192)return _0x4a8589;else _0x4a8589(0x0);}catch(_0x348a52){}}
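The output above is consistent with a configuration along the following lines. This is a reconstruction under assumptions, not the original listing: the source loop is inferred from the `for(let i=0x0;i<0x5;i++)` fragment in the output, and the interval value is arbitrary. Recent releases of javascript-obfuscator appear to expect debugProtectionInterval as a number of milliseconds, while older ones took a boolean, so check the README of the version you have installed.

```javascript
// Assumed input: inferred from the loop visible in the obfuscated output.
const code = `
for (let i = 0; i < 5; i++) {
  console.log('i', i)
}
`;

// Assumed options: not recovered from the original article.
const options = {
  debugProtection: true,
  // Assumption: a millisecond interval (newer releases); older releases used true/false.
  debugProtectionInterval: 4000,
};
```

These two values would then be passed to the same `obfuscate(code, options)` helper used in the other examples.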

  • Domain lock

The domainLock option restricts the JavaScript code to running only under specific domains, which lowers the risk of the code being simulated or stolen.

For example:

const code = `
console.log('hello world')
`

const options = {
  domainLock: ['cuiqingcai.com']
}

const obfuscator = require('javascript-obfuscator')

function obfuscate(code, options) {
  return obfuscator.obfuscate(code, options).getObfuscatedCode()
}

console.log(obfuscate(code, options))

Here domainLock is given a single domain, cuiqingcai.com, which acts as a domain whitelist. The obfuscated output is as follows:

var _0x589dd8=_0x2d78;function _0x2d78(_0x47b956,_0x27bf9f){var _0x322302=_0x4ea5();return _0x2d78=function(_0x2160d5,_0x3049cc){_0x2160d5=_0x2160d5-0x16d;var _0x135bae=_0x322302[_0x2160d5];return _0x135bae;},_0x2d78(_0x47b956,_0x27bf9f);}(function(_0x47a2b8,_0x37e8f9){var _0xe5499b=_0x2d78,_0x1cabdc=_0x47a2b8();while(!![]){try{var _0x1ef79d=parseInt(_0xe5499b(0x173))/0x1+-parseInt(_0xe5499b(0x176))/0x2*(-parseInt(_0xe5499b(0x16d))/0x3)+-parseInt(_0xe5499b(0x180))/0x4+parseInt(_0xe5499b(0x16e))/0x5+parseInt(_0xe5499b(0x17e))/0x6+parseInt(_0xe5499b(0x177))/0x7*(-parseInt(_0xe5499b(0x179))/0x8)+-parseInt(_0xe5499b(0x175))/0x9;if(_0x1ef79d===_0x37e8f9)break;else _0x1cabdc['push'](_0x1cabdc['shift']());}catch(_0x54204c){_0x1cabdc['push'](_0x1cabdc['shift']());}}}(_0x4ea5,0x9f9ea));var _0x3049cc=(function(){var _0x3f8a2b=!![];return function(_0x186865,_0x1497e8){var _0x4fffaf=_0x3f8a2b?function(){var _0x231187=_0x2d78;if(_0x1497e8){var _0x3bedd8=_0x1497e8[_0x231187(0x174)](_0x186865,arguments);return _0x1497e8=null,_0x3bedd8;}}:function(){};return _0x3f8a2b=![],_0x4fffaf;};}()),_0x2160d5=_0x3049cc(this,function(){var _0x51cf3e=_0x2d78,_0x2f9734=function(){var _0x4dfbff=_0x2d78,_0x1e30e7;try{_0x1e30e7=Function(_0x4dfbff(0x17a)+_0x4dfbff(0x17d)+');')();}catch(_0x5a8abd){_0x1e30e7=window;}return _0x1e30e7;},_0x2a11e6=_0x2f9734(),_0x597b09=new RegExp(_0x51cf3e(0x183),'g'),_0x232a84=_0x51cf3e(0x17c)[_0x51cf3e(0x178)](_0x597b09,'')[_0x51cf3e(0x17b)](';'),_0x43e98e,_0x559142,_0x50bd03,_0x2e039e,_0x244d9b=function(_0x43517a,_0x4e52f4,_0x1634dc){var _0x127a9d=_0x51cf3e;if(_0x43517a[_0x127a9d(0x171)]!=_0x4e52f4)return![];for(var _0x10edb7=0x0;_0x10edb7<_0x4e52f4;_0x10edb7++){for(var _0x202b79=0x0;_0x202b79<_0x1634dc[_0x127a9d(0x171)];_0x202b79+=0x2){if(_0x10edb7==_0x1634dc[_0x202b79]&&_0x43517a[_0x127a9d(0x181)](_0x10edb7)!=_0x1634dc[_0x202b79+0x1])return![];}}return!![];},_0xf4a5ae=function(_0x3e5617,_0x53d5db,_0x58ab24){return 
_0x244d9b(_0x53d5db,_0x58ab24,_0x3e5617);},_0x2fc234=function(_0x2d27b6,_0x2a93f3,_0xa5fb12){return _0xf4a5ae(_0x2a93f3,_0x2d27b6,_0xa5fb12);},_0x4d5190=function(_0x112bb6,_0x2fb813,_0x3ae06d){return _0x2fc234(_0x2fb813,_0x3ae06d,_0x112bb6);};for(var _0x4aedca in _0x2a11e6){if(_0x244d9b(_0x4aedca,0x8,[0x7,0x74,0x5,0x65,0x3,0x75,0x0,0x64])){_0x43e98e=_0x4aedca;break;}}for(var _0xcbf864 in _0x2a11e6[_0x43e98e]){if(_0x4d5190(0x6,_0xcbf864,[0x5,0x6e,0x0,0x64])){_0x559142=_0xcbf864;break;}}for(var _0x392075 in _0x2a11e6[_0x43e98e]){if(_0x2fc234(_0x392075,[0x7,0x6e,0x0,0x6c],0x8)){_0x50bd03=_0x392075;break;}}if(!('~'>_0x559142))for(var _0x382f12 in _0x2a11e6[_0x43e98e][_0x50bd03]){if(_0xf4a5ae([0x7,0x65,0x0,0x68],_0x382f12,0x8)){_0x2e039e=_0x382f12;break;}}if(!_0x43e98e||!_0x2a11e6[_0x43e98e])return;var _0xa19a9b=_0x2a11e6[_0x43e98e][_0x559142],_0x119b47=!!_0x2a11e6[_0x43e98e][_0x50bd03]&&_0x2a11e6[_0x43e98e][_0x50bd03][_0x2e039e],_0x15b5d4=_0xa19a9b||_0x119b47;if(!_0x15b5d4)return;var _0x154fe5=![];for(var _0x27c485=0x0;_0x27c485<_0x232a84[_0x51cf3e(0x171)];_0x27c485++){var _0x559142=_0x232a84[_0x27c485],_0x31b39f=_0x559142[0x0]===String['fromCharCode'](0x2e)?_0x559142[_0x51cf3e(0x184)](0x1):_0x559142,_0x4e0de7=_0x15b5d4[_0x51cf3e(0x171)]-_0x31b39f[_0x51cf3e(0x171)],_0x121706=_0x15b5d4[_0x51cf3e(0x182)](_0x31b39f,_0x4e0de7),_0x60443c=_0x121706!==-0x1&&_0x121706===_0x4e0de7;_0x60443c&&((_0x15b5d4[_0x51cf3e(0x171)]==_0x559142[_0x51cf3e(0x171)]||_0x559142[_0x51cf3e(0x182)]('.')===0x0)&&(_0x154fe5=!![]));}if(!_0x154fe5){var _0x5e1b5f=new RegExp(_0x51cf3e(0x16f),'g'),_0x23031c=_0x51cf3e(0x17f)[_0x51cf3e(0x178)](_0x5e1b5f,'');_0x2a11e6[_0x43e98e][_0x50bd03]=_0x23031c;}});function _0x4ea5(){var 
_0x50c27c=['13119lwLMyn','6278280HEebXS','[dQTZKevHZpcChQTvAQYpScST]','log','length','hello\x20world','942985MpbBIJ','apply','20740851Upfdlp','526lrYFza','14OdaaIU','replace','1532632YoqplN','return\x20(function()\x20','split','kvcuAtFiqreinCwgVycai.czoAmXyetftvQFeYkzYbzwOehtfT','{}.constructor(\x22return\x20this\x22)(\x20)','3591678pQUkUO','adQTbZouKetvHZpcC:bhQlaTvAQnkYpScST','2423416EImzoL','charCodeAt','indexOf','[kvAtFreCwVyzAXyetftvQFeYkzYbzwOehtfT]','slice'];_0x4ea5=function(){return _0x50c27c;};return _0x4ea5();}_0x2160d5(),console[_0x589dd8(0x170)](_0x589dd8(0x172));

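This code runs only under the specified domain, cuiqingcai.com, and not on other sites. If the relevant JavaScript is extracted and run on another site, or executed by a program outside the browser, it fails, which lowers the risk of the code being simulated or stolen.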
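Stripped of the obfuscation, the check boils down to comparing the current host name against the whitelist. The sketch below is a simplified illustration, not the library's implementation: `isDomainAllowed` is a hypothetical name, and the convention that a leading dot also admits sub-domains is an assumption based on how the domainLock option is commonly documented.

```javascript
// Hypothetical helper: returns true when `hostname` is covered by the whitelist.
// Assumption: an entry such as '.example.com' covers example.com and its sub-domains,
// while an entry without a leading dot must match the host name exactly.
function isDomainAllowed(hostname, whitelist) {
  return whitelist.some((entry) => {
    if (entry.startsWith('.')) {
      const base = entry.slice(1);
      return hostname === base || hostname.endsWith(entry);
    }
    return hostname === entry;
  });
}

// In a browser the host name would come from location.hostname;
// a literal is used here so the sketch also runs outside the browser.
const allowed = isDomainAllowed('cuiqingcai.com', ['cuiqingcai.com']);
const rejected = isDomainAllowed('example.org', ['cuiqingcai.com']);
```

For the reverse engineer this also shows where such a lock is weak: the check depends on values like the host name that an emulated environment can supply itself.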

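  • Special encoding

Tools such as jsfuck, aaencode, and jjencode re-encode source code using an unusual or very restricted character set. Take the following line as input:

var a = 1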

[Figure: screenshot of the jsfuck online tool]

Result of the jsfuck tool:

[][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[+!+[]]+(!![]+[])[+[]]][([][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[+!+[]]+(!![]+[])[+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[+!+[]]+(!![]+[])[+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[+!+[]]+(!![]+[])[+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[+!+[]]+(!![]+[])[+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]]((!![]+[])[+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+([][[]]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+!+[]]+(+[![]]+[][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[+!+[]]+(!![]+[])[+[]]])[+!+[]+[+!+[]]]+(!![]+[])[!+[]+!+[]+!+[]]+(+(!+[]+!+[]+!+[]+[+!+[]]))[(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[+!+[]]+(!![]+[])[+[]]])[+!+[]+[+[]]]+
...
[])[+!+[]]+(!![]+[])[+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(!![]+[])[+[]]+([][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[+!+[]]+(!![]+[])[+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[+!+[]]+(!![]+[])[+[]]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[+!+[]]+(!![]+[])[+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]]()[+!+[]+[+!+[]]]+(+[![]]+[][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[+!+[]]+(!![]+[])[+[]]])[+!+[]+[+!+[]]]+[+!+[]]+(![]+[])[+[]]+([][[]]+[])[!+[]+!+[]]+(![]+[])[+[]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(![]+[])[+[]]+([][[]]+[])[!+[]+!+[]]+(![]+[])[+!+[]])

Result of the aaencode tool (online at https://utf-8.jp/public/aaencode.html):

[Figure: screenshot of the aaencode output]

Result of the jjencode tool (online at https://utf-8.jp/public/jjencode.html):

[Figure: screenshot of the jjencode output]

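Code obfuscated with these tools looks impenetrable at first, but once the pattern is found it is easy to restore, so they do not achieve strong obfuscation.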
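To see the pattern in the jsfuck case, it helps to evaluate the smallest building blocks by hand. The sketch below relies only on standard JavaScript type coercion; the variable names are illustrative. It decodes the first bracketed expression of the jsfuck output above, which spells out a property name one character at a time.

```javascript
// Numbers are built from coercing arrays and booleans.
const zero = +[];                 // unary plus on an empty array gives 0
const one = +!+[];                // !0 is true, +true gives 1
const two = !+[] + !+[];          // true + true gives 2

// Strings are built by concatenating a value with an empty array.
const falseStr = ![] + [];        // "false"
const trueStr = !![] + [];        // "true"
const undefinedStr = [][[]] + []; // "undefined"

// Characters are picked out of those strings by index.
const f = (![] + [])[+[]];         // "false"[0]
const l = (![] + [])[!+[] + !+[]]; // "false"[2]
const a = (![] + [])[+!+[]];       // "false"[1]
const t = (!![] + [])[+[]];        // "true"[0]

// The opening of the jsfuck output is [][f + l + a + t], a lookup on an empty array.
const propertyName = f + l + a + t;
const lookedUp = [][propertyName];
```

Every longer jsfuck expression is a composition of the same steps, so replacing these sub-expressions with their values bottom-up gradually recovers readable code.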
