Creating network trees from 26 JSON files

ICD-11 data consists of 26 root nodes, each with its own subtree. The data for each subtree was downloaded using a recursive algorithm deployed by Dibakar Sigdel. The network graph database was created using NetworkX, with the following code:

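import networkx as nx

# N is the list of parsed JSON files, one list of items per root
trees = []
for data in N:
    G = nx.Graph()
    for item in data:
        item_id = item['id']
        G.add_node(item_id,
                   title=item['title'],
                   defn=item['defn'])
        childs = item['childs']
        # items without children appear to carry the placeholder 'Key Not found'
        if childs != 'Key Not found':
            for c_id in childs:
                G.add_edge(item_id, c_id, object='child')
    trees.append(G)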

The JSON files were imported and read into a list (N), and a second list (trees) was created to store the graph, with its nodes and edges, for each of the 26 trees.
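The loading step is not shown above. A minimal sketch of how the files could be read into a list, assuming each JSON file holds one list of item dictionaries; the helper name load_json_files and the directory layout are hypothetical, not taken from the original code:

```python
import json
from pathlib import Path


def load_json_files(directory):
    """Read every *.json file in `directory` into a list, sorted by file name."""
    data_per_file = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data_per_file.append(json.load(handle))
    return data_per_file
```

With the 26 files in a single folder, N = load_json_files(folder) would give the list that the graph-building loop iterates over. Sorting by file name keeps the order of the trees reproducible between runs.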

Properties of nodes

  • Item id: ID of the node
  • Title: Name of the node
  • Defn: Description of the node
  • Child: Children of the node (used to build the edges that specify relationships)
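To illustrate how these properties end up on the graph, here is a small sketch with two made-up items (the ids, titles and definitions are placeholders, not ICD-11 content). It assumes the same item keys as the code above and that NetworkX is installed:

```python
import networkx as nx

# Hypothetical items in the same shape as the downloaded JSON records
items = [
    {"id": "root", "title": "Root chapter", "defn": "Top-level node",
     "childs": ["leaf"]},
    {"id": "leaf", "title": "Leaf entity", "defn": "Node without children",
     "childs": "Key Not found"},
]

G = nx.Graph()
for item in items:
    # title and defn are stored as node attributes
    G.add_node(item["id"], title=item["title"], defn=item["defn"])
    if item["childs"] != "Key Not found":
        for child_id in item["childs"]:
            # the child list becomes edges, labelled through the 'object' attribute
            G.add_edge(item["id"], child_id, object="child")

root_title = G.nodes["root"]["title"]            # attribute lookup on a node
edge_label = G.edges["root", "leaf"]["object"]   # attribute lookup on an edge
```

Node attributes are read back through G.nodes[node_id] and edge attributes through G.edges[u, v], so the title and definition stay attached to each id after the tree is built.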

Tidying data into a pandas DataFrame

We do this because Dash works well with pandas DataFrames.

import pandas as pd

lists = []  # one depth-count table per root and its subtree
for G in trees:
    root = list(G.nodes())[0]  # assumes the first node added is the root
    node2depth = []
    # iterate through every node in the subtree
    for node in G.nodes():
        depth = nx.shortest_path_length(G, root, node)
        node2depth.append({"node": node, "depth": depth})
    DF = pd.DataFrame(node2depth)  # node id and corresponding depth level
    lists.append(DF.groupby("depth").count())  # number of nodes at each depth

Transform the DataFrame into tidy format (observations in rows, variables in columns):

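df = pd.concat(lists, axis=1)  # one column per tree, one row per depth
df = df.fillna(0)  # depths a tree does not reach have zero nodes
df = df.transpose()  # one row per tree, one column per depth
df.index = titles  # titles: the root titles, assumed to be defined earlier
df = df.astype(int)
df2 = df.assign(total_nodes=lambda x: x.sum(axis=1))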
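To make the shape of the resulting table concrete, the sketch below runs the same idea on two toy trees (made-up edges, not ICD-11 data). It swaps the per-node shortest_path_length calls for a single nx.single_source_shortest_path_length call per tree, which returns the depth of every reachable node at once; the tree names and roots are hypothetical.

```python
import networkx as nx
import pandas as pd

# Two toy trees given as edge lists, with their roots named explicitly
toy_trees = {
    "tree_a": [("a", "a1"), ("a", "a2"), ("a1", "a11")],
    "tree_b": [("b", "b1")],
}
roots = {"tree_a": "a", "tree_b": "b"}

rows = {}
for name, edges in toy_trees.items():
    G = nx.Graph(edges)
    # dict mapping each node to its distance from the root
    depths = nx.single_source_shortest_path_length(G, roots[name])
    # number of nodes found at each depth level
    rows[name] = pd.Series(depths).value_counts().sort_index()

# one row per tree, one column per depth, zeros where a tree is shallower
depth_table = pd.DataFrame(rows).T.fillna(0).astype(int)
depth_table = depth_table.assign(total_nodes=lambda x: x.sum(axis=1))
```

Each row is one tree (an observation) and each column is a depth level (a variable), which is the tidy layout described above. For the 26 ICD-11 trees the row labels would be the root titles.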
