Cobra model parser

This notebook produces the DataFrames needed for the “modelReaction_data_I” input of the INCA tools.

This example notebook illustrates a parser that prepares part of the input for another set of BFAIR tools, the MFA tools. Run it if you want to perform MFA on a metabolic model that has not been processed before. Parsers are currently provided for .json and .sbml files. The procedure is identical for both file types; one example of each is shown with its output.
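The file-type handling described above can be pictured with a minimal sketch. It is not BFAIR's implementation: `detect_file_type` and `load_model` are hypothetical helper names, and mapping the file extension to the `file_type` label is an assumption. The two loaders, `cobra.io.load_json_model` and `cobra.io.read_sbml_model`, are existing COBRApy functions.

```python
from pathlib import Path

# Assumed mapping from file extension to the file_type label (hypothetical).
_FILE_TYPES = {".json": "json", ".sbml": "sbml"}


def detect_file_type(model_path):
    """Return 'json' or 'sbml' based on the file extension (hypothetical helper)."""
    suffix = Path(model_path).suffix.lower()
    try:
        return _FILE_TYPES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported model file extension: {suffix!r}") from None


def load_model(model_path):
    """Load a COBRA model with the loader matching its file type (hypothetical helper)."""
    # Imported lazily so that detect_file_type works without COBRApy installed.
    import cobra.io

    if detect_file_type(model_path) == "json":
        return cobra.io.load_json_model(model_path)
    return cobra.io.read_sbml_model(model_path)
```

Calling `detect_file_type` on the two example files used in this notebook would give `'json'` and `'sbml'`, the same labels that appear in the `file_type` column of `model_data`.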

[1]:
import cobra
import pandas as pd

# BFAIR dependencies
from BFAIR.INCA import parse_cobra_model
[2]:
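# Parse the E. coli model from a .json file; the second and third arguments end up in the model_id and date columns
model_data, reaction_data, metabolite_data = parse_cobra_model('data/FIA_MS_example/database_files/iJO1366.json', 'E. coli', '10-03-2021')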
[3]:
model_data
[3]:
model_id date model_description model_file file_type
0 E. coli 10-03-2021 {\n"metabolites":[\n{\n"id":"10fthf_c",\n"name... json
[4]:
reaction_data
[4]:
model_id rxn_id rxn_name equation subsystem gpr genes reactants_stoichiometry reactants_ids products_stoichiometry products_ids lower_bound upper_bound objective_coefficient flux_units reversibility used_
0 E. coli EX_cm_e Chloramphenicol exchange cm_e --> Extracellular exchange [] [-1.0] [cm_e] [] [] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
1 E. coli EX_cmp_e CMP exchange cmp_e --> Extracellular exchange [] [-1.0] [cmp_e] [] [] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
2 E. coli EX_co2_e CO2 exchange co2_e <=> Extracellular exchange [] [-1.0] [co2_e] [] [] -1000.0 1000.0 0.0 mmol*gDW-1*hr-1 True True
3 E. coli EX_cobalt2_e Co2+ exchange cobalt2_e <=> Extracellular exchange [] [-1.0] [cobalt2_e] [] [] -1000.0 1000.0 0.0 mmol*gDW-1*hr-1 True True
4 E. coli DM_4crsol_c Sink needed to allow p-Cresol to leave system 4crsol_c --> Intracellular demand [] [-1.0] [4crsol_c] [] [] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2578 E. coli RNDR4 Ribonucleoside-diphosphate reductase (UDP) trdrd_c + udp_c --> dudp_c + h2o_c + trdox_c Nucleotide Salvage Pathway ((b2234 and b2235) and b3781) or ((b2234 and b... [b2234, b3781, b2235, b2582] [-1.0, -1.0] [trdrd_c, udp_c] [1.0, 1.0, 1.0] [dudp_c, h2o_c, trdox_c] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
2579 E. coli RNDR4b Ribonucleoside-diphosphate reductase (UDP) (gl... grxrd_c + udp_c --> dudp_c + grxox_c + h2o_c Nucleotide Salvage Pathway (b0849 and (b2675 and b2676)) or (b1064 and (b... [b2675, b3610, b0849, b1064, b1654, b2676] [-1.0, -1.0] [grxrd_c, udp_c] [1.0, 1.0, 1.0] [dudp_c, grxox_c, h2o_c] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
2580 E. coli RNTR1c2 Ribonucleoside-triphosphate reductase (ATP) (f... atp_c + 2.0 flxr_c + 2.0 h_c --> datp_c + 2.0 ... Nucleotide Salvage Pathway (b0684 and b3924 and b4238 and b4237) or (b289... [b0684, b4238, b2895, b4237, b3924] [-1.0, -2.0, -2.0] [atp_c, flxr_c, h_c] [1.0, 2.0, 1.0] [datp_c, flxso_c, h2o_c] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
2581 E. coli RNTR2c2 Ribonucleoside-triphosphate reductase (GTP) (f... 2.0 flxr_c + gtp_c + 2.0 h_c --> dgtp_c + 2.0 ... Nucleotide Salvage Pathway (b0684 and b3924 and b4238 and b4237) or (b289... [b0684, b4238, b2895, b4237, b3924] [-2.0, -1.0, -2.0] [flxr_c, gtp_c, h_c] [1.0, 2.0, 1.0] [dgtp_c, flxso_c, h2o_c] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
2582 E. coli RNTR3c2 Ribonucleoside-triphosphate reductase (CTP) (f... ctp_c + 2.0 flxr_c + 2.0 h_c --> dctp_c + 2.0 ... Nucleotide Salvage Pathway (b0684 and b3924 and b4238 and b4237) or (b289... [b0684, b4238, b2895, b4237, b3924] [-1.0, -2.0, -2.0] [ctp_c, flxr_c, h_c] [1.0, 2.0, 1.0] [dctp_c, flxso_c, h2o_c] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True

2583 rows × 17 columns
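The relationship between the `equation` column and the four stoichiometry and id columns can be illustrated with a short sketch. This is an assumption-laden illustration, not the way `parse_cobra_model` derives these columns: `split_equation` is a hypothetical helper that only works on the printed equation string, and it assumes metabolite ids contain no spaces.

```python
def split_equation(equation):
    """Split 'a + 2.0 b --> c' into signed coefficients and ids (hypothetical helper)."""
    for arrow in ("<=>", "-->", "<--"):
        if arrow in equation:
            left, right = equation.split(arrow)
            break
    else:
        raise ValueError(f"No reaction arrow found in {equation!r}")

    def parse_side(side, sign):
        coefficients, ids = [], []
        for term in side.split(" + "):
            parts = term.split()
            if not parts:  # empty side, e.g. the product side of an exchange reaction
                continue
            # A term is either 'met_id' or 'coefficient met_id'.
            coefficient = float(parts[0]) if len(parts) == 2 else 1.0
            coefficients.append(sign * coefficient)
            ids.append(parts[-1])
        return coefficients, ids

    reactants_stoichiometry, reactants_ids = parse_side(left, -1.0)
    products_stoichiometry, products_ids = parse_side(right, 1.0)
    return reactants_stoichiometry, reactants_ids, products_stoichiometry, products_ids


# Equation of reaction RNDR4 from the table above.
example = split_equation("trdrd_c + udp_c --> dudp_c + h2o_c + trdox_c")
# example == ([-1.0, -1.0], ['trdrd_c', 'udp_c'],
#             [1.0, 1.0, 1.0], ['dudp_c', 'h2o_c', 'trdox_c'])
```

Reactant coefficients carry a negative sign and product coefficients a positive one, which matches the convention visible in the `reactants_stoichiometry` and `products_stoichiometry` columns.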

[5]:
metabolite_data
[5]:
      model_id  met_name                                            met_id      formula     charge  compartment  bound  used_
0     E. coli   10-Formyltetrahydrofolate                           10fthf_c    C20H21N7O7  -2      c            0.0    True
1     E. coli   1,2-Diacyl-sn-glycerol (didodecanoyl, n-C12:0)      12dgr120_c  C27H52O5    0       c            0.0    True
2     E. coli   1,2-Diacyl-sn-glycerol (ditetradecanoyl, n-C14:0)   12dgr140_c  C31H60O5    0       c            0.0    True
3     E. coli   1,2-Diacyl-sn-glycerol (ditetradec-7-enoyl, n-...   12dgr141_c  C31H56O5    0       c            0.0    True
4     E. coli   1,2-Diacyl-sn-glycerol (dihexadecanoyl, n-C16:0)    12dgr160_c  C35H68O5    0       c            0.0    True
...   ...       ...                                                 ...         ...         ...     ...          ...    ...
1800  E. coli   D-Serine                                            ser__D_p    C3H7NO3     0       p            0.0    True
1801  E. coli   L-Serine                                            ser__L_p    C3H7NO3     0       p            0.0    True
1802  E. coli   Shikimate                                           skm_p       C7H9O5      -1      p            0.0    True
1803  E. coli   Selenite                                            slnt_p      O3Se        -2      p            0.0    True
1804  E. coli   Sulfur dioxide                                      so2_p       O2S         0       p            0.0    True

1805 rows × 8 columns

[6]:
# Parse the C. elegans model from an .sbml file; the call is the same as for the .json file
model_data, reaction_data, metabolite_data = parse_cobra_model('data/FIA_MS_example/database_files/wormjam-20180125.sbml', 'C. elegans', '10-03-2021')
'' is not a valid SBML 'SId'.
[7]:
model_data
[7]:
model_id date model_description model_file file_type
0 C. elegans 10-03-2021 <?xml version='1.0' encoding='UTF-8'?>\n<sbml ... sbml
[8]:
reaction_data
[8]:
model_id rxn_id rxn_name equation subsystem gpr genes reactants_stoichiometry reactants_ids products_stoichiometry products_ids lower_bound upper_bound objective_coefficient flux_units reversibility used_
0 C. elegans ACACCT_c acetoacetyl-CoA:acetate CoA-transferase acac_c + accoa_c <=> aacoa_c + ac_c WBGene00007330 [WBGene00007330] [-1.0, -1.0] [acac_c, accoa_c] [1.0, 1.0] [aacoa_c, ac_c] -1000.0 1000.0 0.0 mmol*gDW-1*hr-1 True True
1 C. elegans ACACCT_m acetoacetyl-CoA:acetate CoA-transferase acac_m + accoa_m <=> aacoa_m + ac_m WBGene00007330 [WBGene00007330] [-1.0, -1.0] [acac_m, accoa_m] [1.0, 1.0] [aacoa_m, ac_m] -1000.0 1000.0 0.0 mmol*gDW-1*hr-1 True True
2 C. elegans 1_2_1_18_RXN_m Malonate-semialdehyde dehydrogenase (acetylating) coa_m + msa_m + nadp_m --> accoa_m + co2_m + n... WBGene00000114 [WBGene00000114] [-1.0, -1.0, -1.0] [coa_m, msa_m, nadp_m] [1.0, 1.0, 1.0] [accoa_m, co2_m, nadph_m] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
3 C. elegans ACYLCOASYN_RXN_c 2,3,4-saturated fatty acyl-CoA synthetase atp_c + coa_c + fatacid_c --> amp_c + fataccoa... WBGene00009218 [WBGene00009218] [-1.0, -1.0, -1.0] [atp_c, coa_c, fatacid_c] [1.0, 1.0, 1.0] [amp_c, fataccoa_c, ppi_c] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
4 C. elegans ACYLCOASYN_RXN_m 2,3,4-saturated fatty acyl-CoA synthetase atp_m + coa_m + fatacid_m --> amp_m + fataccoa... WBGene00009218 [WBGene00009218] [-1.0, -1.0, -1.0] [atp_m, coa_m, fatacid_m] [1.0, 1.0, 1.0] [amp_m, fataccoa_m, ppi_m] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
3296 C. elegans DARDMT_n nuclear desoxyadenosine residue N6 demethylation akg_n + mdadnr_n + o2_n --> co2_n + dadnr_n + ... WBGene00017304 [WBGene00017304] [-1.0, -1.0, -1.0] [akg_n, mdadnr_n, o2_n] [1.0, 1.0, 1.0] [co2_n, dadnr_n, succ_n] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
3297 C. elegans DARMT_n nuclear desoxyadenosine residue N6 methylation amet_n + dadnr_n --> ahcys_n + mdadnr_n WBGene00015939 [WBGene00015939] [-1.0, -1.0] [amet_n, dadnr_n] [1.0, 1.0] [ahcys_n, mdadnr_n] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
3298 C. elegans OH_Exchange_reactions_e OH transport oh_e --> WBGene00007388 [WBGene00007388] [-1.0] [oh_e] [] [] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
3299 C. elegans BIO0020 Assembly of free fatty acids pool for biomass ... 0.0171 arach_c + 1e-05 ddca_c + 1e-05 fa16p1n7... [] [-0.0171, -1e-05, -1e-05, -0.0417, -0.0616, -0... [arach_c, ddca_c, fa16p1n7_c, hdca_c, lnlc_c, ... [1.0] [freefatacid_c] 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True
3300 C. elegans APOACP_SYNTH apo-ACP protein synthesis (from amino acids) 0.090226 alatrna_c + 0.06015 argtrna_c + 0.007... [] [-0.090226, -0.06015, -0.007519, -0.097744, -0... [alatrna_c, argtrna_c, asntrna_c, asptrna_c, a... [0.324, 1.0, 2.0, 2.324, 2.324, 0.090226, 0.06... [adp_c, apoACP_c, gdp_c, h_c, pi_c, trnaala_c,... 0.0 1000.0 0.0 mmol*gDW-1*hr-1 False True

3301 rows × 17 columns

[9]:
metabolite_data
[9]:
      model_id    met_name                                            met_id           formula     charge  compartment    bound  used_
0     C. elegans  ((N-acetyl-D-glucosaminyl)2-(alpha-D-mannosyl)...   n2m2masn_c       None        0       cytosol        0.0    True
1     C. elegans  ({[(mannosyl),(phosphoethanolaminyl)]-dimannos...   mem2emgacpail_c  None        0       cytosol        0.0    True
2     C. elegans  (1,4-alpha-D-glucosyl)n-glucosyl glucogenin         ggn_n            None        0       nucleus        0.0    True
3     C. elegans  (1,4-alpha-D-glucosyl)n-glucosyl glucogenin         ggn_c            None        0       cytosol        0.0    True
4     C. elegans  (3R)-3-hydroxymyristoyl-[acp]                       3hmrsACP_c       None        0       cytosol        0.0    True
...   ...         ...                                                 ...              ...         ...     ...            ...    ...
2388  C. elegans  uridine-5'-monophosphate(1−) residue                urir_m           C9H10N2O8P  -1      mitochondrion  0.0    True
2389  C. elegans  cytidine 5'-monophosphate(1-) residue               cytr_m           C9H11N3O7P  -1      mitochondrion  0.0    True
2390  C. elegans  cytidine 5'-monophosphate(1-) residue               cytr_c           C9H11N3O7P  -1      cytosol        0.0    True
2391  C. elegans  Composite of all DNA and RNA for biomass            dnarnatotal_c    None        0       cytosol        0.0    True
2392  C. elegans                                                      freefatacid_c    None        0                      0.0    True

2393 rows × 8 columns
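The SBML output contains rows with a formula of `None`, and the last row appears to have neither a name nor a compartment. Before passing such a table on, it may help to list the incomplete entries. The sketch below is only an illustration: `missing_fields` is a hypothetical helper, the choice of required fields is an assumption, and three rows are copied from the output above as plain dictionaries so that the example runs without the model file.

```python
# Three rows copied from the C. elegans metabolite_data output, reduced to the
# fields checked here. The empty compartment of the last row is a reading of
# the printed table, not a verified value.
rows = [
    {"met_id": "n2m2masn_c", "formula": None, "compartment": "cytosol"},
    {"met_id": "urir_m", "formula": "C9H10N2O8P", "compartment": "mitochondrion"},
    {"met_id": "freefatacid_c", "formula": None, "compartment": ""},
]


def missing_fields(row, required=("formula", "compartment")):
    """Return the required fields that are None or empty in a row (hypothetical helper)."""
    return [field for field in required if not row.get(field)]


# Map each incomplete metabolite id to the fields it lacks.
incomplete = {
    row["met_id"]: missing_fields(row) for row in rows if missing_fields(row)
}
# incomplete == {'n2m2masn_c': ['formula'],
#                'freefatacid_c': ['formula', 'compartment']}
```

Whether a missing formula or compartment matters depends on the downstream MFA step; the check only makes the gaps visible.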
