Test code:
import time

import ray


@ray.remote
class RayMemoryObject:
    """Ray actor; get_data is invoked through a remote call."""

    def __init__(self):
        self.data = 1

    def get_data(self):
        return self.data


class MemoryObject:
    """Plain Python object; get_data is an ordinary method call."""

    def __init__(self):
        self.data = 1

    def get_data(self):
        return self.data


NUM_RUNS = [1, 10, 100, 1000, 5000, 10000, 50000, 100000]  # calls per measurement
EACH_RUN = 5  # repetitions averaged for each entry of NUM_RUNS

if __name__ == "__main__":
    ray.init(num_cpus=8)
    ray_ref = RayMemoryObject.remote()
    mem_obj = MemoryObject()
    res = []
    for r in NUM_RUNS:
        print(f"Testing getting a number for {r} times...")
        tmp_data_ray = []
        tmp_data_mem = []
        for e_r in range(EACH_RUN):
            t0 = time.time()
            # Each iteration submits one actor task and blocks on its result.
            for i in range(r):
                ray.get(ray_ref.get_data.remote())
            t1 = time.time()
            for i in range(r):
                mem_obj.get_data()
            t2 = time.time()
            tmp_data_mem.append(t2 - t1)
            tmp_data_ray.append(t1 - t0)
        # Store the mean time in seconds as [plain object, Ray actor].
        res.append([sum(tmp_data_mem) / len(tmp_data_mem), sum(tmp_data_ray) / len(tmp_data_ray)])
    print("%8s | %8s | %8s | %8s" % ("Num Iter", "Phy Mem", "Ray Mem", "Ratio"))
    for i in range(len(res)):
        print("%8d | %8.2f | %8.2f | %8.2f" % (NUM_RUNS[i], res[i][0], res[i][1], res[i][1] / res[i][0]))
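The first class fetches the value from Ray's shared memory; the second fetches it from physical memory. The results (times in seconds, averaged over 5 runs) are: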
Num Iter |  Phy Mem |  Ray Mem |    Ratio
       1 |     0.00 |     0.13 | 27309.15
      10 |     0.00 |     0.01 |   498.49
     100 |     0.00 |     0.08 |  1223.88
    1000 |     0.00 |     0.79 |  1275.69
    5000 |     0.00 |     4.16 |  1301.07
   10000 |     0.01 |     8.65 |  1679.47
   50000 |     0.02 |    41.11 |  1870.96
  100000 |     0.04 |    83.57 |  2112.83
Fetching the value from Ray's shared memory is more than a thousand times slower than fetching it from physical memory.
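To put the table in per-call terms, the Ray totals can be divided by the number of calls. The sketch below does that arithmetic with the values copied from the table; the names `ray_totals_s` and `per_call_ms` are mine, chosen only for illustration. It suggests a roughly constant cost of about 0.8 ms for each blocking `ray.get(ray_ref.get_data.remote())` round trip.

```python
# Ray totals in seconds, copied from the "Ray Mem" column of the table above.
# The 1- and 10-call rows are left out because their totals are too small
# and too rounded to give a meaningful per-call figure.
ray_totals_s = {
    100: 0.08,
    1000: 0.79,
    5000: 4.16,
    10000: 8.65,
    50000: 41.11,
    100000: 83.57,
}


def per_call_ms(total_s, n_calls):
    """Average time per call in milliseconds."""
    return total_s / n_calls * 1000.0


per_call = {n: per_call_ms(t, n) for n, t in ray_totals_s.items()}

for n, ms in per_call.items():
    print(f"{n:>8d} calls: {ms:.3f} ms per call")
```

Since the per-call figure barely changes with the number of iterations, the slowdown looks like a fixed overhead per remote call rather than something that grows with the workload.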
Profile results (first few lines):
         80036972 function calls (78347649 primitive calls) in 696.781 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   830555  481.887    0.001  481.887    0.001 {method 'get_objects' of 'ray._raylet.CoreWorker' objects}
   830555   44.821    0.000   44.821    0.000 {method 'submit_actor_task' of 'ray._raylet.CoreWorker' objects}
   830555   16.173    0.000   16.173    0.000 {ray._raylet.split_buffer}
        1   14.670   14.670  696.783  696.783 main.py:1(<module>)
   830556   12.344    0.000   17.479    0.000 inspect.py:2909(_bind)
   830555    9.008    0.000  566.797    0.001 worker.py:2205(get)
   830555    6.928    0.000   17.581    0.000 {built-in method loads}
1661127/830558    6.815    0.000  591.847    0.001 client_mode_hook.py:96(wrapper)
   830555    6.300    0.000   49.202    0.000 serialization.py:341(deserialize_objects)
   830555    6.063    0.000   80.557    0.000 actor.py:1109(_actor_method_call)
   830555    5.671    0.000  554.389    0.001 worker.py:643(get_objects)
   830556    4.984    0.000   29.325    0.000 signature.py:81(flatten_args)
   830555    4.745    0.000    9.390    0.000 worker.py:517(get_serialization_context)
   830705    4.714    0.000    5.438    0.000 inspect.py:2781(__init__)
Here, {method 'get_objects' of 'ray._raylet.CoreWorker' objects} takes a very long time (about 482 of the 697 seconds). Is this normal?
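For anyone who wants to reproduce a listing in the same format, a profile ordered by internal time can be produced with the standard-library `cProfile` and `pstats` modules. The following is only a minimal sketch: `workload` is a hypothetical stand-in that calls the plain `MemoryObject` and makes no Ray calls, so the numbers it prints will not resemble the ones above.

```python
import cProfile
import io
import pstats


class MemoryObject:
    def __init__(self):
        self.data = 1

    def get_data(self):
        return self.data


def workload(n):
    # Hypothetical stand-in for the benchmark loop (assumption: no Ray here).
    obj = MemoryObject()
    total = 0
    for _ in range(n):
        total += obj.get_data()
    return total


profiler = cProfile.Profile()
profiler.enable()
result = workload(10000)
profiler.disable()

# Sorting by "tottime" yields the "Ordered by: internal time" header.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats("tottime").print_stats(5)  # keep only the top 5 entries
report = buf.getvalue()
print(report)
```

In the real benchmark, the same `enable()`/`disable()` pair would wrap the loop over `ray.get(ray_ref.get_data.remote())`; note that this only profiles the driver process, not the actor process.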