Creating network tree from 26 json files
ICD 11 data consists of 26 root nodes, each having its own subtree. The data for each subtree/root was downloaded using a recursive algorithm deployed by Dibakar Sigdel. The network graph database was made created using NetworkX, with the following code
trees = []
for data in N:
G = nx.Graph()
for item in data:
item_id = item['id']
G.add_node(item_id,\
title=item['title'],\
defn = item['defn'])
childs = item['childs']
if childs!= 'Key Not found':
for c_id in childs:
G.add_edge(item_id,c_id, object = 'child')
trees.append(G)
The JSON files were imported and read into a list, and another list was created which stored the nodes and edges of all the nodes in the 26 trees.
Properties of nodes
- Item id: ID of the node
- Title: Name of the node
- Defn: Description of the node
- Child: Children of the node (helps in building edges that specify relationship
Tidying data into pandas.dataframe
<<<<<<< HEAD The reason we want to do this is because Dash works well with pandas data.frames.
lists = [] #list storing every single root with its subtree
for G in trees:
root = list(G.nodes())[0] #stores the root tree
node2depth = []
#for loop to iterate through every node in the each subtree
for node in G.nodes():
depth = nx.shortest_path_length(G, root, node)
#print(node,"|",depth)
node2depth.append({"node":node,"depth":depth})
DF = pd.DataFrame(node2depth) #stores node id and corresponding depth level
DF = DF.set_index("depth")
lists.append(DF.groupby("depth").count())
Transform data.frame into tidy format (observations in rows, variables in columns)
df = pd.concat(lists, axis=1) df = df.fillna(0) df = df.transpose() df.index = titles df = df.astype(int) df2 = df.assign(total_nodes = lambda x: df.sum(axis =1))
======= The reason we want to do this is because Dash works well with pandas data.frames
fdebf356440937f88cd478d30c950f880a9470e1