
Run with 47K rows, 566 columns and 48 batches takes 5+ hours and is still running

Open fbrundu opened this issue 12 years ago • 2 comments

I am running combat.py on 47K genes, 566 samples and 48 batches, and it has been running for more than 5 hours without finishing. Is that normal? The R version finishes in less than 1 hour. Can I help with something? The data is fairly confidential, but I will provide information where possible. I terminated combat.py; this is the output:

found 48 batches
found 0 numerical covariates...
found 0 categorical variables:

Standardizing Data across genes.
Fitting L/S model and finding priors
Finding parametric adjustments
^CTraceback (most recent call last):
  File "rm_batches.py", line 45, in <module>
    main()
  File "rm_batches.py", line 40, in main
    ebat = combat.combat(dat, bat['Batch'], None)
  File "/home/unsel/Dropbox/poli/pb/class/xeno/combat.py", line 99, in combat
    delta_hat[i], gamma_bar[i], t2[i], a_prior[i], b_prior[i])
  File "/home/unsel/Dropbox/poli/pb/class/xeno/combat.py", line 134, in it_sol
    sum2 = ((sdat - np.dot(g_new.reshape((g_new.shape[0], 1)), np.ones((1, sdat.shape[1])))) ** 2).sum(axis=1)
KeyboardInterrupt
^C
real    2767m37.090s
user    369m50.351s
sys 0m12.489s

Hope it helps..
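
For reference, the line where the traceback was interrupted (combat.py line 134, in it_sol) builds a full genes-by-samples matrix with np.dot and np.ones just to subtract g_new from every column of sdat. Below is a minimal sketch, assuming sdat is a 2-D float array and g_new is a 1-D array with one entry per row of sdat, showing that NumPy broadcasting gives the same sum2 without the temporary outer product. The function names sum2_outer and sum2_broadcast are made up for this illustration and are not part of combat.py, and this is not confirmed to be the actual bottleneck.

```python
import numpy as np


def sum2_outer(sdat, g_new):
    # Same expression as the combat.py line shown in the traceback:
    # an explicit outer product with a row of ones, then subtract and square.
    ones_row = np.ones((1, sdat.shape[1]))
    g_col = g_new.reshape((g_new.shape[0], 1))
    return ((sdat - np.dot(g_col, ones_row)) ** 2).sum(axis=1)


def sum2_broadcast(sdat, g_new):
    # Hypothetical alternative: let NumPy broadcast the column vector
    # across the columns of sdat instead of materialising the outer product.
    return ((sdat - g_new[:, np.newaxis]) ** 2).sum(axis=1)


# Small synthetic check (random data, not the data from this issue).
rng = np.random.default_rng(0)
sdat = rng.normal(size=(1000, 12))
g_new = rng.normal(size=1000)
assert np.allclose(sum2_outer(sdat, g_new), sum2_broadcast(sdat, g_new))
```

Both functions return one value per row of sdat. Whether this changes the total runtime noticeably would have to be measured on real data.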

fbrundu avatar Jul 05 '13 21:07 fbrundu

That doesn't seem unreasonable. I'd let it run for at least 48 hours.

brentp avatar Jul 08 '13 16:07 brentp

Sorry I didn't answer earlier. Why doesn't it seem unreasonable? Isn't the Python version supposed to be faster than the R version?

fbrundu avatar Jul 26 '13 15:07 fbrundu