TODO_old.md

MassTodon todo list

important

  • divide the project into smaller modules:
    • masstodon_bokeh
    • spectrum_bokeh
    • something like that
    • add support for Tkinter on Linux so that it is easier to actually plot the spectrum
  • save results of the procedures
    • collect the errors in one file
    • aggregate the information on precursors and save it
    • save the estimated intensities and probabilities resulting from parsing
  • update docs and readmes.
  • re-implement the multiprocessing version of the software.
    • it is an argument for the graph representation
  • Update to IsoSpec 2.0
  • check different modes of isotopic calculations under IsoSpec2.0
  • add assertions to read_mzxml_spectrum
  • Replace ubiquitin dataset
    • now, the intensity is monotonically increasing
  • RE must parse H-200 -> {'H': -200}
  • Have a look at IsotopeCalculator/simulator.py
    • DO WE USE SPECTRUM CLASS HERE? We certainly could.
  • Spectrum reader must be run without sorting the spectra all the time.
  • Export data to ETDetective
  • Support multiple input precursors!!!
  • To each deconvolution problem add a measure of the problem's complexity
    • for instance in terms of the condition number of the Gram matrix.
  • simplify the MassTodon API
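
A minimal sketch of the formula-parsing item above (`H-200 -> {'H': -200}`). The function name `parse_formula` and the token pattern are assumptions made for illustration, not the project's existing parser; the sketch only handles flat formulas without brackets.

```python
import re
from collections import Counter

# An element symbol (capital letter, optional lowercase letter) followed by
# an optional signed integer count; a missing count means 1.
_TOKEN = re.compile(r"([A-Z][a-z]?)([+-]?\d+)?")


def parse_formula(formula):
    """Parse a flat formula such as 'C2H-200' into {'C': 2, 'H': -200}."""
    counts = Counter()
    pos = 0
    while pos < len(formula):
        match = _TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"cannot parse {formula!r} at position {pos}")
        element, count = match.groups()
        # Item assignment keeps negative totals (unlike Counter arithmetic).
        counts[element] += int(count) if count else 1
        pos = match.end()
    return dict(counts)
```

Negative counts are kept as they are, so a formula difference such as a loss of 200 hydrogens can be represented directly.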

elegant

  • there should be a structure taking care of the molecules
    • it might be the precursor itself, but think whether it should be something more general that could encompass more than one precursor.
  • Set the defaults to the spectrum plot parsing from bash, and types too.
  • add L1 optimization for a more robust fit
  • the adding of spectra is far from optimal:
    • should be some structure that will add them linearly (maintain the order)
    • make it into a separate Python module
  • removing theoretical molecules
    • implement an algorithm that establishes empirical clusters
    • compare the empirical clusters with real data
  • deconvolution:
    • implement the L1 fit to data using the simplex algorithm
    • implement gaussian kernel approach
    • implement Bayesian setting
    • implement EM setting
  • Memoization of isotopic distributions should be an option, not a must.
  • add another intensity-based criterion here.
    • basically, check how much of a substance there could be if it were the only possible source of these ions, and accept it if that is more than some threshold. This way silly solutions should be eliminated.
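
A sketch of the "add spectra while maintaining the order" item above, assuming each spectrum is a list of `(m/z, intensity)` pairs already sorted by m/z. The name `add_spectra` and this data layout are hypothetical. The merge never re-sorts: for two spectra it is a single linear pass, and for k spectra `heapq.merge` costs O(n log k).

```python
import heapq
from itertools import groupby
from operator import itemgetter


def add_spectra(*spectra):
    """Merge spectra given as (m/z, intensity) pairs sorted by m/z.

    Intensities of peaks sharing exactly the same m/z are summed.
    """
    merged = heapq.merge(*spectra, key=itemgetter(0))
    return [(mz, sum(intensity for _, intensity in peaks))
            for mz, peaks in groupby(merged, key=itemgetter(0))]
```

Since it only relies on the standard library, such a helper could live in a separate small module, as the item suggests.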

Don't forget

  • To write the results to a file, only write a method that generates all the rows; then simply use the general write_from_buffer function.
  • The bloody Proline has precursor.get_AA(4,'C_alpha') == lCnt()
  • The +1H makes part of the c-fragment definition
    • think about the possible products.
  • Get rid of spurious dependencies in setup.py
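
The "generate rows, then use one general writer" note above could look roughly like this. The signature of the project's `write_from_buffer` is not shown in this list, so `write_rows` below is only a stand-in, and the row layout (`formula`, `charge`, `intensity`) is a made-up example.

```python
import csv
import io


def result_rows(estimates):
    """Yield a header row, then one row per estimate (hypothetical layout)."""
    yield ("formula", "charge", "intensity")
    for estimate in estimates:
        yield (estimate["formula"], estimate["charge"], estimate["intensity"])


def write_rows(rows, handle):
    """Stand-in for a general writer: dump any iterable of rows as CSV."""
    csv.writer(handle).writerows(rows)


# Usage: the writer does not care which method produced the rows.
buffer = io.StringIO()
write_rows(result_rows([{"formula": "C2H6O", "charge": 1, "intensity": 12.5}]),
           buffer)
```

Each kind of result then only needs its own row generator, while the writing code stays in one place.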

IsoSpec:

  • Get a version that does not need string parsing.
  • no need to be wrapped in 'cdata2numpyarray'.
  • rounded m/z to some precision.
  • with probabilities not logprobabilities.
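
Until such a version exists, the last two wishes above (rounded m/z, probabilities instead of log-probabilities) can be applied as a post-processing step. The sketch assumes the peaks arrive as two parallel sequences of m/z values and natural-log probabilities; the function name and the default precision are illustrative assumptions, not IsoSpec's API.

```python
import math
from collections import defaultdict


def to_rounded_spectrum(mzs, logprobs, digits=2):
    """Return {rounded m/z: probability}, summing peaks that round together."""
    spectrum = defaultdict(float)
    for mz, logprob in zip(mzs, logprobs):
        # exp() turns a natural-log probability back into a probability.
        spectrum[round(mz, digits)] += math.exp(logprob)
    return dict(spectrum)
```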

Some long term stuff:

  • when more than one precursor
    • exception for the same precursor tags in make_molecules
      • Rationale: otherwise we will not be able to trace the origin of a fragment
      • Problem: what if two substances point to the same formula?
        • check hashes
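
A sketch of the two checks described above, assuming precursors are passed as `(tag, formula)` pairs with the formula given as a string. The function name and the input layout are hypothetical; the real `make_molecules` is not shown here. Duplicate tags raise immediately, while substances sharing a formula are only reported, since the list leaves open how to handle them.

```python
from collections import defaultdict


def check_precursors(precursors):
    """Validate (tag, formula) pairs.

    Raises ValueError on a repeated tag. Returns a dict mapping every formula
    used by more than one tag to the list of those tags.
    """
    seen_tags = set()
    tags_by_formula = defaultdict(list)  # dict lookup hashes the formula
    for tag, formula in precursors:
        if tag in seen_tags:
            raise ValueError(f"duplicate precursor tag: {tag!r}")
        seen_tags.add(tag)
        tags_by_formula[formula].append(tag)
    return {formula: tags
            for formula, tags in tags_by_formula.items() if len(tags) > 1}
```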

Deemed stupid

  • implement the additional 1D-regression-test for fragment inclusion.
    • this is unlikely to reduce the number of possible formulas.

============================================================================

Done:

  • add the plots of
    • fragments probability
    • aggregated fragments intensity
  • eliminate the alphas and raw estimates
    • we are saving them with the reporter class
  • add reading from '.json'
  • Both of these calls work:
    • D = Deconvolutor(molecules, spectrum, L1_x=0.002, L2_x=0.002, L1_alpha=0.002, L2_alpha=0.002)
    • D = Deconvolutor(molecules, spectrum)
  • subclass linearCounter into the project and rename it as atom_cnt.
    • add monoisotopic mass
    • add IsotopeCalculator as a subroutine here: not bad! This can be a class field.
      • The Formula class should initiate a field called isotopic generator.
  • Cannot pass different non-default args to mol.isotopologues()
    • e.g. mol.isotopologues(5, .99) seems to loop like hell.
    • might be because of some issues with aggregation.
  • Get rid of highcharts -> turn it into Bokeh
  • Drawing of
    • spectra
    • deconvolution
      • Add the simple plot (observed-predicted) to the module
      • Add the complex plot (observed-predicted-per-molecule)
      • Bash scripts:
        • plot_spectrum to open any mzXML, mzML, or txt spectrum file
        • IsoSpecPy plot tools
  • Automate devel set up
  • Add the invisible buffers to Measure.plot()
  • csv:
    • input spectrum
    • save results
    • no pandas