Cobra model parser
Here we produce the DataFrames needed for the “modelReaction_data_I” input of the INCA tools.
This example notebook illustrates a parser that prepares part of the input for another set of BFAIR tools, the MFA (metabolic flux analysis) tools. This step is required if you want to run MFA on a metabolic model that has not been processed before. Processing tools are currently provided for .json and .sbml files. The procedure is the same for both file types; one example of each, with the corresponding output, is shown here.
[1]:
import cobra
import pandas as pd
# BFAIR dependencies
from BFAIR.INCA import parse_cobra_model
[2]:
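# Parse the E. coli model from a .json file; the second and third arguments fill the model_id and date columns
model_data, reaction_data, metabolite_data = parse_cobra_model('data/FIA_MS_example/database_files/iJO1366.json', 'E. coli', '10-03-2021')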
[3]:
model_data
[3]:
| | model_id | date | model_description | model_file | file_type |
|---|---|---|---|---|---|
| 0 | E. coli | 10-03-2021 | | {\n"metabolites":[\n{\n"id":"10fthf_c",\n"name... | json |
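The file_type column records which format the model was read from. One way to derive such a label from the file name is sketched below. `infer_file_type` is a hypothetical helper and the suffix-based rule is an assumption, since the notebook does not show how parse_cobra_model makes this decision.

```python
from pathlib import Path

# File types named in the introduction of this notebook.
SUPPORTED = {"json", "sbml"}


def infer_file_type(model_path):
    """Return the lower-cased file suffix, or raise if it is unsupported.

    Hypothetical helper for illustration; not part of BFAIR.
    """
    suffix = Path(model_path).suffix.lstrip(".").lower()
    if suffix not in SUPPORTED:
        raise ValueError(f"unsupported model file type: {suffix!r}")
    return suffix


# The two model files used in this notebook.
json_type = infer_file_type("data/FIA_MS_example/database_files/iJO1366.json")
sbml_type = infer_file_type("data/FIA_MS_example/database_files/wormjam-20180125.sbml")
```

Rejecting unknown suffixes early keeps a mistyped path from reaching the model loader.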
[4]:
reaction_data
[4]:
| | model_id | rxn_id | rxn_name | equation | subsystem | gpr | genes | reactants_stoichiometry | reactants_ids | products_stoichiometry | products_ids | lower_bound | upper_bound | objective_coefficient | flux_units | reversibility | used_ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | E. coli | EX_cm_e | Chloramphenicol exchange | cm_e --> | Extracellular exchange | | [] | [-1.0] | [cm_e] | [] | [] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 1 | E. coli | EX_cmp_e | CMP exchange | cmp_e --> | Extracellular exchange | | [] | [-1.0] | [cmp_e] | [] | [] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 2 | E. coli | EX_co2_e | CO2 exchange | co2_e <=> | Extracellular exchange | | [] | [-1.0] | [co2_e] | [] | [] | -1000.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | True | True |
| 3 | E. coli | EX_cobalt2_e | Co2+ exchange | cobalt2_e <=> | Extracellular exchange | | [] | [-1.0] | [cobalt2_e] | [] | [] | -1000.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | True | True |
| 4 | E. coli | DM_4crsol_c | Sink needed to allow p-Cresol to leave system | 4crsol_c --> | Intracellular demand | | [] | [-1.0] | [4crsol_c] | [] | [] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2578 | E. coli | RNDR4 | Ribonucleoside-diphosphate reductase (UDP) | trdrd_c + udp_c --> dudp_c + h2o_c + trdox_c | Nucleotide Salvage Pathway | ((b2234 and b2235) and b3781) or ((b2234 and b... | [b2234, b3781, b2235, b2582] | [-1.0, -1.0] | [trdrd_c, udp_c] | [1.0, 1.0, 1.0] | [dudp_c, h2o_c, trdox_c] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 2579 | E. coli | RNDR4b | Ribonucleoside-diphosphate reductase (UDP) (gl... | grxrd_c + udp_c --> dudp_c + grxox_c + h2o_c | Nucleotide Salvage Pathway | (b0849 and (b2675 and b2676)) or (b1064 and (b... | [b2675, b3610, b0849, b1064, b1654, b2676] | [-1.0, -1.0] | [grxrd_c, udp_c] | [1.0, 1.0, 1.0] | [dudp_c, grxox_c, h2o_c] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 2580 | E. coli | RNTR1c2 | Ribonucleoside-triphosphate reductase (ATP) (f... | atp_c + 2.0 flxr_c + 2.0 h_c --> datp_c + 2.0 ... | Nucleotide Salvage Pathway | (b0684 and b3924 and b4238 and b4237) or (b289... | [b0684, b4238, b2895, b4237, b3924] | [-1.0, -2.0, -2.0] | [atp_c, flxr_c, h_c] | [1.0, 2.0, 1.0] | [datp_c, flxso_c, h2o_c] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 2581 | E. coli | RNTR2c2 | Ribonucleoside-triphosphate reductase (GTP) (f... | 2.0 flxr_c + gtp_c + 2.0 h_c --> dgtp_c + 2.0 ... | Nucleotide Salvage Pathway | (b0684 and b3924 and b4238 and b4237) or (b289... | [b0684, b4238, b2895, b4237, b3924] | [-2.0, -1.0, -2.0] | [flxr_c, gtp_c, h_c] | [1.0, 2.0, 1.0] | [dgtp_c, flxso_c, h2o_c] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 2582 | E. coli | RNTR3c2 | Ribonucleoside-triphosphate reductase (CTP) (f... | ctp_c + 2.0 flxr_c + 2.0 h_c --> dctp_c + 2.0 ... | Nucleotide Salvage Pathway | (b0684 and b3924 and b4238 and b4237) or (b289... | [b0684, b4238, b2895, b4237, b3924] | [-1.0, -2.0, -2.0] | [ctp_c, flxr_c, h_c] | [1.0, 2.0, 1.0] | [dctp_c, flxso_c, h2o_c] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
2583 rows × 17 columns
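To make the list-valued columns above concrete, here is a minimal sketch of how such fields could be derived from a single reaction entry laid out as in a COBRApy JSON model (a `metabolites` map of metabolite ID to signed coefficient, plus bounds). The helper `reaction_row`, the sorting by metabolite ID, and the reversibility rule (lower bound below zero and upper bound above zero) are assumptions chosen to reproduce the rows shown; this is not BFAIR's actual implementation.

```python
# Sketch only: `reaction_row` is a hypothetical helper, not a BFAIR function.

def reaction_row(model_id, rxn):
    """Split a reaction's metabolite map into reactant and product columns."""
    # Sort by metabolite ID so the column order is deterministic (assumption).
    items = sorted(rxn["metabolites"].items())
    reactants = [(met, coeff) for met, coeff in items if coeff < 0]
    products = [(met, coeff) for met, coeff in items if coeff > 0]
    lower, upper = rxn["lower_bound"], rxn["upper_bound"]
    return {
        "model_id": model_id,
        "rxn_id": rxn["id"],
        "reactants_stoichiometry": [coeff for _, coeff in reactants],
        "reactants_ids": [met for met, _ in reactants],
        "products_stoichiometry": [coeff for _, coeff in products],
        "products_ids": [met for met, _ in products],
        "lower_bound": lower,
        "upper_bound": upper,
        # Assumed rule: reversible when flux may run in both directions.
        "reversibility": lower < 0 < upper,
    }


# Values copied from the RNDR4 and EX_co2_e rows of the table above.
rndr4 = {
    "id": "RNDR4",
    "metabolites": {
        "trdrd_c": -1.0, "udp_c": -1.0,
        "dudp_c": 1.0, "h2o_c": 1.0, "trdox_c": 1.0,
    },
    "lower_bound": 0.0,
    "upper_bound": 1000.0,
}
ex_co2 = {
    "id": "EX_co2_e",
    "metabolites": {"co2_e": -1.0},
    "lower_bound": -1000.0,
    "upper_bound": 1000.0,
}
rows = [reaction_row("E. coli", rxn) for rxn in (rndr4, ex_co2)]
```

With these inputs the sketch yields the same reactant and product lists, and the same reversibility flags, as the RNDR4 and EX_co2_e rows in the table.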
[5]:
metabolite_data
[5]:
| | model_id | met_name | met_id | formula | charge | compartment | bound | used_ |
|---|---|---|---|---|---|---|---|---|
| 0 | E. coli | 10-Formyltetrahydrofolate | 10fthf_c | C20H21N7O7 | -2 | c | 0.0 | True |
| 1 | E. coli | 1,2-Diacyl-sn-glycerol (didodecanoyl, n-C12:0) | 12dgr120_c | C27H52O5 | 0 | c | 0.0 | True |
| 2 | E. coli | 1,2-Diacyl-sn-glycerol (ditetradecanoyl, n-C14:0) | 12dgr140_c | C31H60O5 | 0 | c | 0.0 | True |
| 3 | E. coli | 1,2-Diacyl-sn-glycerol (ditetradec-7-enoyl, n-... | 12dgr141_c | C31H56O5 | 0 | c | 0.0 | True |
| 4 | E. coli | 1,2-Diacyl-sn-glycerol (dihexadecanoyl, n-C16:0) | 12dgr160_c | C35H68O5 | 0 | c | 0.0 | True |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1800 | E. coli | D-Serine | ser__D_p | C3H7NO3 | 0 | p | 0.0 | True |
| 1801 | E. coli | L-Serine | ser__L_p | C3H7NO3 | 0 | p | 0.0 | True |
| 1802 | E. coli | Shikimate | skm_p | C7H9O5 | -1 | p | 0.0 | True |
| 1803 | E. coli | Selenite | slnt_p | O3Se | -2 | p | 0.0 | True |
| 1804 | E. coli | Sulfur dioxide | so2_p | O2S | 0 | p | 0.0 | True |
1805 rows × 8 columns
[6]:
# Same call for a C. elegans model stored as an .sbml file
model_data, reaction_data, metabolite_data = parse_cobra_model('data/FIA_MS_example/database_files/wormjam-20180125.sbml', 'C. elegans', '10-03-2021')
'' is not a valid SBML 'SId'.
[7]:
model_data
[7]:
| | model_id | date | model_description | model_file | file_type |
|---|---|---|---|---|---|
| 0 | C. elegans | 10-03-2021 | | <?xml version='1.0' encoding='UTF-8'?>\n<sbml ... | sbml |
[8]:
reaction_data
[8]:
| | model_id | rxn_id | rxn_name | equation | subsystem | gpr | genes | reactants_stoichiometry | reactants_ids | products_stoichiometry | products_ids | lower_bound | upper_bound | objective_coefficient | flux_units | reversibility | used_ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | C. elegans | ACACCT_c | acetoacetyl-CoA:acetate CoA-transferase | acac_c + accoa_c <=> aacoa_c + ac_c | | WBGene00007330 | [WBGene00007330] | [-1.0, -1.0] | [acac_c, accoa_c] | [1.0, 1.0] | [aacoa_c, ac_c] | -1000.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | True | True |
| 1 | C. elegans | ACACCT_m | acetoacetyl-CoA:acetate CoA-transferase | acac_m + accoa_m <=> aacoa_m + ac_m | | WBGene00007330 | [WBGene00007330] | [-1.0, -1.0] | [acac_m, accoa_m] | [1.0, 1.0] | [aacoa_m, ac_m] | -1000.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | True | True |
| 2 | C. elegans | 1_2_1_18_RXN_m | Malonate-semialdehyde dehydrogenase (acetylating) | coa_m + msa_m + nadp_m --> accoa_m + co2_m + n... | | WBGene00000114 | [WBGene00000114] | [-1.0, -1.0, -1.0] | [coa_m, msa_m, nadp_m] | [1.0, 1.0, 1.0] | [accoa_m, co2_m, nadph_m] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 3 | C. elegans | ACYLCOASYN_RXN_c | 2,3,4-saturated fatty acyl-CoA synthetase | atp_c + coa_c + fatacid_c --> amp_c + fataccoa... | | WBGene00009218 | [WBGene00009218] | [-1.0, -1.0, -1.0] | [atp_c, coa_c, fatacid_c] | [1.0, 1.0, 1.0] | [amp_c, fataccoa_c, ppi_c] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 4 | C. elegans | ACYLCOASYN_RXN_m | 2,3,4-saturated fatty acyl-CoA synthetase | atp_m + coa_m + fatacid_m --> amp_m + fataccoa... | | WBGene00009218 | [WBGene00009218] | [-1.0, -1.0, -1.0] | [atp_m, coa_m, fatacid_m] | [1.0, 1.0, 1.0] | [amp_m, fataccoa_m, ppi_m] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3296 | C. elegans | DARDMT_n | nuclear desoxyadenosine residue N6 demethylation | akg_n + mdadnr_n + o2_n --> co2_n + dadnr_n + ... | | WBGene00017304 | [WBGene00017304] | [-1.0, -1.0, -1.0] | [akg_n, mdadnr_n, o2_n] | [1.0, 1.0, 1.0] | [co2_n, dadnr_n, succ_n] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 3297 | C. elegans | DARMT_n | nuclear desoxyadenosine residue N6 methylation | amet_n + dadnr_n --> ahcys_n + mdadnr_n | | WBGene00015939 | [WBGene00015939] | [-1.0, -1.0] | [amet_n, dadnr_n] | [1.0, 1.0] | [ahcys_n, mdadnr_n] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 3298 | C. elegans | OH_Exchange_reactions_e | OH transport | oh_e --> | | WBGene00007388 | [WBGene00007388] | [-1.0] | [oh_e] | [] | [] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 3299 | C. elegans | BIO0020 | Assembly of free fatty acids pool for biomass ... | 0.0171 arach_c + 1e-05 ddca_c + 1e-05 fa16p1n7... | | | [] | [-0.0171, -1e-05, -1e-05, -0.0417, -0.0616, -0... | [arach_c, ddca_c, fa16p1n7_c, hdca_c, lnlc_c, ... | [1.0] | [freefatacid_c] | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
| 3300 | C. elegans | APOACP_SYNTH | apo-ACP protein synthesis (from amino acids) | 0.090226 alatrna_c + 0.06015 argtrna_c + 0.007... | | | [] | [-0.090226, -0.06015, -0.007519, -0.097744, -0... | [alatrna_c, argtrna_c, asntrna_c, asptrna_c, a... | [0.324, 1.0, 2.0, 2.324, 2.324, 0.090226, 0.06... | [adp_c, apoACP_c, gdp_c, h_c, pi_c, trnaala_c,... | 0.0 | 1000.0 | 0.0 | mmol*gDW-1*hr-1 | False | True |
3301 rows × 17 columns
[9]:
metabolite_data
[9]:
| | model_id | met_name | met_id | formula | charge | compartment | bound | used_ |
|---|---|---|---|---|---|---|---|---|
| 0 | C. elegans | ((N-acetyl-D-glucosaminyl)2-(alpha-D-mannosyl)... | n2m2masn_c | None | 0 | cytosol | 0.0 | True |
| 1 | C. elegans | ({[(mannosyl),(phosphoethanolaminyl)]-dimannos... | mem2emgacpail_c | None | 0 | cytosol | 0.0 | True |
| 2 | C. elegans | (1,4-alpha-D-glucosyl)n-glucosyl glucogenin | ggn_n | None | 0 | nucleus | 0.0 | True |
| 3 | C. elegans | (1,4-alpha-D-glucosyl)n-glucosyl glucogenin | ggn_c | None | 0 | cytosol | 0.0 | True |
| 4 | C. elegans | (3R)-3-hydroxymyristoyl-[acp] | 3hmrsACP_c | None | 0 | cytosol | 0.0 | True |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2388 | C. elegans | uridine-5'-monophosphate(1−) residue | urir_m | C9H10N2O8P | -1 | mitochondrion | 0.0 | True |
| 2389 | C. elegans | cytidine 5'-monophosphate(1-) residue | cytr_m | C9H11N3O7P | -1 | mitochondrion | 0.0 | True |
| 2390 | C. elegans | cytidine 5'-monophosphate(1-) residue | cytr_c | C9H11N3O7P | -1 | cytosol | 0.0 | True |
| 2391 | C. elegans | Composite of all DNA and RNA for biomass | dnarnatotal_c | None | 0 | cytosol | 0.0 | True |
| 2392 | C. elegans | | freefatacid_c | None | 0 | | 0.0 | True |
2393 rows × 8 columns
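Several metabolites in the SBML example carry no formula, and the last record (freefatacid_c) appears to lack a name and a compartment as well. A quick completeness check before passing the data on to the MFA tools could look like the sketch below. It works on plain dicts with values copied from the rows above; `missing_fields` and the chosen field list are hypothetical, and the empty strings for freefatacid_c are an assumption about how the blank cells are stored.

```python
# Sketch only: `missing_fields` is a hypothetical helper, not a BFAIR function.

def missing_fields(record, fields=("met_name", "formula", "compartment")):
    """Return the names of the fields that are None or an empty string."""
    return [field for field in fields if record.get(field) in (None, "")]


# Values copied from the last rows of the metabolite table above.
records = [
    {"met_id": "cytr_c", "met_name": "cytidine 5'-monophosphate(1-) residue",
     "formula": "C9H11N3O7P", "compartment": "cytosol"},
    {"met_id": "dnarnatotal_c", "met_name": "Composite of all DNA and RNA for biomass",
     "formula": None, "compartment": "cytosol"},
    # Assumption: the blank name and compartment are stored as empty strings.
    {"met_id": "freefatacid_c", "met_name": "",
     "formula": None, "compartment": ""},
]

# Map each metabolite ID to its missing fields, keeping only incomplete ones.
report = {rec["met_id"]: missing_fields(rec) for rec in records}
incomplete = {met_id: gaps for met_id, gaps in report.items() if gaps}
```

Here `incomplete` lists dnarnatotal_c (no formula) and freefatacid_c (no name, formula, or compartment), while cytr_c passes the check.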