Memory leak ??

  • Kim Petersen

    Memory leak ??

    Is this a memory leak, the malloc/free implementation, the GC kicking in
    late, a known bug, or something else?

    Using python-2.2.2-26 on RH9 (shrike) x86 -fully patched

    The following program slowly eats up more and more memory when run on
    large datasets... can anyone tell what the trouble is?

    I've run it on up to 240,000 record sets so far, and it eats about 0.1% of
    my memory per 1,000 records (the exact amount doesn't really matter, does
    it?).

    --
    Med Venlig Hilsen / Regards

    Kim Petersen - Kyborg A/S (Udvikling)
    IT - Innovationshuset
    Havneparken 2
    7100 Vejle
    Tlf. +4576408183 || Fax. +4576408188

    #!/usr/bin/python
    #
    # Created: 13:32 10/07-2003 by Kim Petersen <kp@kyborg.dk>
    #
    # $Id$
    from __future__ import generators
    import gzip
    import re

    # Matches lines of the form: 'ERROR: <message>' in '<statement>'
    err1=re.compile(r"^'ERROR:\s+(.*?)' in '(.*)'\s*$")

    def iterator(file):
       # Yield lines one at a time, reading about 1000 bytes per batch.
       buffer=[]
       while 1:
          if not buffer:
             buffer=file.readlines(1000)
             if not buffer:
                return   # end of file: finish the generator
          line=buffer[0]
          del buffer[0]
          yield line

    def getrec(lines):
       # Collect lines up to the next blank line (or end of input).
       result=[]
       while 1:
          try:
             line=lines.next().rstrip()
          except StopIteration:
             break
          if not line: break
          result.append(line)
       if not result: return None
       # The last line of a record is the dataset; the rest is the error text.
       (error,dataset)=(result[:-1],eval(result[-1]))
       error=''.join(error)[16:]
       return error,dataset

    if __name__ == "__main__":
       import sys

       lines=iterator(gzip.open("error.txt.gz"))
       i=0
       while 1:
          # Progress counter, refreshed every 1000 records.
          if (i%1000)==0:
             sys.stdout.write("%-10.10d\r" % (i,))
             sys.stdout.flush()
          rec=getrec(lines)
          if not rec: break
          (errline,dataset)=rec
          # Print any error line that does not match the expected pattern.
          if not err1.match(errline):
             sys.stdout.write("%s\n" % (errline,))
             sys.stdout.write("%-10.10d\r" % (i,))
             sys.stdout.flush()
          i+=1
       sys.stdout.write("%-10.10d\n" % (i,))
       sys.stdout.flush()

    # Local Variables:
    # tab-width: 3
    # py-indent-offset: 3
    # End:


  • A.M. Kuchling

    #2
    Re: Memory leak ??

    On Thu, 10 Jul 2003 14:34:05 +0200,
    Kim Petersen <kp@kyborg.dk> wrote:
    > Using python-2.2.2-26 on RH9 (shrike) x86 -fully patched
    >
    > The following program slowly eats up more and more memory when run on
    > large datasets... can anyone tell what the trouble is?

    Your code uses eval(), which is pretty heavyweight because it has to
    tokenize, parse, and then evaluate the string. There have been a few memory
    leaks in eval(), and perhaps you're running into one of them. Try using
    int() or float() to convert strings to numbers instead of eval. As a bonus,
    your program will be faster and much more secure (could an attacker tweak
    your logfiles so you end up eval()ing os.unlink('/etc/passwd')?).

    In general, using eval() is almost always a mistake; few programs need to
    take arbitrary expressions as input.

    --amk
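
A minimal sketch of the int()/float() suggestion above. It assumes the last line of each record is a flat, comma-separated list of numbers such as `[1, 2.5, 3]`; the thread never shows the actual record format, so that is only an assumption. `parse_number` and `parse_number_list` are hypothetical helper names, and the code is written for a current Python 3 interpreter rather than the 2.2.2 used in the thread.

```python
def parse_number(text):
    """Convert a numeric string to int, falling back to float."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        # Not an integer literal; float() raises ValueError for non-numbers.
        return float(text)


def parse_number_list(line):
    """Parse a flat list like "[1, 2.5, 3]" without calling eval()."""
    body = line.strip()[1:-1]  # drop the surrounding brackets
    if not body.strip():
        return []
    return [parse_number(item) for item in body.split(",")]


print(parse_number_list("[1, 2.5, -3e2]"))  # prints [1, 2.5, -300.0]
```

Anything that is not a number, such as `os.unlink('/etc/passwd')`, raises ValueError instead of being executed.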


    • Kim Petersen

      #3
      Re: Memory leak ?? [resolved - thank you]

      A.M. Kuchling wrote:
      > On Thu, 10 Jul 2003 14:34:05 +0200,
      > Kim Petersen <kp@kyborg.dk> wrote:
      >
      >>Using python-2.2.2-26 on RH9 (shrike) x86 -fully patched
      >>
      >>The following program slowly eats up more and more memory when run on
      >>large datasets... can anyone tell what the trouble is?
      >
      >
      > Your code uses eval(), which is pretty heavyweight because it has to
      > tokenize, parse, and then evaluate the string. There have been a few memory
      > leaks in eval(), and perhaps you're running into one of them. Try using
      > int() or float() to convert strings to numbers instead of eval. As a bonus,
      > your program will be faster and much more secure (could an attacker tweak
      > your logfiles so you end up eval()ing os.unlink('/etc/passwd')?).

      Thank you very much, it was eval().

      Calling get_list instead of eval solved my trouble. Is there a more
      generic or efficient way to read a list or expression? (I know this one
      will fail for some strings, for instance):

      import traceback

      def get_value(str):
         # Convert one item: None, a quoted string, or a number.
         str=str.strip()
         if str.lower()=='none':
            return None
         elif str[0] in ['"',"'"]:
            return str[1:-1]
         else:
            if str[-1]=='j':
               return complex(str)
            elif '.' in str or 'e' in str:
               return float(str)
            else:
               return int(str)

      def get_list(str):
         # Parse "(a, b, c)" into a tuple and "[a, b, c]" into a list.
         try:
            if str[0]=='(':
               robj=tuple
            else:
               robj=list
            items=str.strip()[1:-1].split(', ')
            return robj(map(get_value,items))
         except:
            traceback.print_exc()
            print str
            return []

      --
      Med Venlig Hilsen / Regards

      Kim Petersen - Kyborg A/S (Udvikling)
      IT - Innovationshuset
      Havneparken 2
      7100 Vejle
      Tlf. +4576408183 || Fax. +4576408188
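
On the question of a more generic approach: later Python versions (2.6 and newer, so not the 2.2.2 interpreter used in this thread) ship `ast.literal_eval`, which accepts only literals such as strings, numbers, tuples, lists, dicts, booleans and None, and rejects function calls. The following is a minimal sketch assuming such an interpreter is available; `read_dataset` is a hypothetical name, and the thread does not establish how this would behave with respect to the memory growth reported above.

```python
import ast


def read_dataset(line):
    """Parse a Python literal from text; return None if it is not a literal."""
    try:
        return ast.literal_eval(line.strip())
    except (ValueError, SyntaxError):
        # Raised for anything that is not a plain literal, e.g. a call.
        return None


print(read_dataset("(1, 2.5, 'abc', None)"))     # prints (1, 2.5, 'abc', None)
print(read_dataset("os.unlink('/etc/passwd')"))  # prints None
```

Unlike splitting on ', ', this also handles strings that contain commas and nested lists or tuples.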
