Replacing value in text file column with string

I’m having a pretty simple issue. I have a dataset (small sample shown below):

22 85 203 174 9 0 362 40 0
21 87 186 165 5 0 379 32 0
30 107 405 306 25 0 756 99 0
6 5 19 6 2 0 160 9 0
21 47 168 148 7 0 352 29 0
28 38 161 114 10 3 375 40 0
27 218 1522 1328 114 0 1026 310 0
21 78 156 135 5 0 300 27 0

The first issue I needed to cover was replacing each space with a comma. I did that with the following code:

import fileinput

with open('Data_Sorted.txt', 'w') as f:
    for line in fileinput.input('DATA.dat'):
        line = line.split(None,8)
        f.write(','.join(line))

The result was the following

22,85,203,174,9,0,362,40,0
21,87,186,165,5,0,379,32,0
30,107,405,306,25,0,756,99,0
6,5,19,6,2,0,160,9,0
21,47,168,148,7,0,352,29,0
28,38,161,114,10,3,375,40,0
27,218,1522,1328,114,0,1026,310,0
21,78,156,135,5,0,300,27,0

My next step is to grab the values from the last column, check whether each one is less than 2 and, if so, replace it with the string ‘nfp’.

I’m able to separate the last column with the following:

for line in open("Data_Sorted.txt"):
    columns = line.split(',')

    print columns[8]

My issue is implementing the conditional to replace the value with the string and then I’m not sure how to put the modified column back into the original dataset.

Solution:

There’s no need to do this in two loops through the file. Also, you can use -1 to index the last element in the line.

import fileinput

with open('Data_Sorted.txt', 'w') as f:
    for line in fileinput.input('DATA.dat'):
        # strip newline character and split on whitespace
        line = line.strip().split()

        # check condition for last element (assuming you're using ints)
        if int(line[-1]) < 2:
            line[-1] = 'nfp'

        # write out the line, but you have to add the newline back in
        f.write(','.join(line) + "\n")
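
The per-line logic can also be pulled into a small function, which makes it easy to try on a few rows without touching any files. This is only a sketch: the name `convert_line`, the `threshold` and `marker` parameters, and the choice to return an empty string for blank lines are assumptions, not part of the answer above.

```python
def convert_line(line, threshold=2, marker='nfp'):
    """Turn a whitespace-separated row into a comma-separated row.

    The last field is replaced by `marker` when it is an integer below
    `threshold`. Blank lines come back as empty strings (an assumption).
    """
    fields = line.split()  # split() with no arguments also drops the newline
    if not fields:
        return ''
    if int(fields[-1]) < threshold:
        fields[-1] = marker
    return ','.join(fields)


sample = [
    "22 85 203 174 9 0 362 40 0\n",   # row from the question
    "28 38 161 114 10 3 375 40 5\n",  # hypothetical row with a last value >= 2
]
converted = [convert_line(row) for row in sample]
print(converted)
```

Inside the loop from the answer, this would be used as `f.write(convert_line(line) + "\n")`.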

Further Reading:

Is it possible to write these 2 lines using the ternary operator?

Example :

if (i < n) 
    i++;
else 
   j += 2;

I tried:

i += i < n ? 1 : j += 2; 

(but the compiler shows an error)

How can I write those lines using the ternary operator, if that is possible?
Thanks …

Solution:

(It’s the “conditional” operator. It’s a ternary operator, and for now Java’s only one, but…)

Since you’re not always changing the value of i, and Java doesn’t allow arbitrary expressions as standalone statements (like some other languages, such as JavaScript, do), you can’t rewrite those using the conditional operator unless you give yourself a nop method or something so you can use the conditional in an expression context. Or doing something really convoluted.

There’s also no reason to. if is quite clear.

But if you wanted to, given:

private static void nop(int n) {
}

then

nop(i < n ? (i = i + 1) : (j = j + 2));

There’s also this massively-convoluted way:

i = i < n ? i + 1 : (j = j + 2) == j ? i : 0;

…which just assigns i back to itself if i < n is false, since we know that (j = j + 2) == j will be true.

But again: There’s no reason to.

How to Reverse inequality statement?

If you multiply an inequality by a negative number you must reverse the direction of the inequality.

For example:

  • 1 < x < 6 (1)
  • -1 > -x > -6 (2)

If x = 4, it is consistent with inequalities (1) and (2).

Is there a way to multiply an inequality statement by an integer in a one-liner for Python to reverse the signs?

From the practical point of view, I am trying to extract DNA/protein sequences from TBLASTN results. There are strands +1 and -1 and the operation after that condition statement is the same.

# one-liner statement I would like to implement
if (start_codon <= coord <= stop_codon)*strand:
    # do operation

# two-liner statement I know would work
if (start_codon <= coord <= stop_codon) and strand==1:
    # do operation
elif (start_codon >= coord >= stop_codon) and strand==-1:
    # do operation

Solution:

You could select the lower and upper bounds based on the strand value. This assumes that strand is always either 1 or -1 and makes use of bool being an int subclass in Python so that True and False can be used to index into pairs:

cdns = (start_codon, stop_codon)
# bool is a subclass of int, so False indexes element 0 and True indexes element 1
if cdns[strand == -1] <= coord <= cdns[strand == 1]:
    pass  # do operation
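
To see the bound selection at work, the check can be wrapped in a function and tried on both strands. This is a sketch only: the function name `in_region` and the coordinates are hypothetical, and it keeps the answer's assumption that strand is exactly 1 or -1.

```python
def in_region(coord, start_codon, stop_codon, strand):
    """Check whether coord lies between the codons, honouring the strand.

    Assumes strand is exactly 1 or -1.
    """
    cdns = (start_codon, stop_codon)
    # strand == 1  -> cdns[0] <= coord <= cdns[1]
    # strand == -1 -> cdns[1] <= coord <= cdns[0]
    return cdns[strand == -1] <= coord <= cdns[strand == 1]


# Hypothetical coordinates, chosen only for illustration
print(in_region(150, 100, 200, 1))    # forward strand, start < stop
print(in_region(150, 200, 100, -1))   # reverse strand, start > stop
print(in_region(250, 200, 100, -1))   # coordinate outside the region
```

The three calls print True, True and False in that order.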

How do you generalise the creation of a list with many variables and conditions of `if`?

I create a list as follows:

['v0' if x%4==0 else 'v1' if x%4==1 else 'v2' if x%4==2 else 'v3' for x in list_1]

How to generalize the creation of such a list, so that it can be easily expanded by a larger number of variables and subsequent conditions?

Solution:

String formatting

Why not use a modulo operation here, and do string formatting, like:

['v{}'.format(x%4) for x in list_1]

Here we calculate x%4 and append the result to 'v' in the string. The nice thing is that we can easily change 4 to another number.
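
As a sketch of how the modulus could become a parameter: the function name `label_by_remainder` and the sample values in `list_1` below are made up for illustration.

```python
def label_by_remainder(values, modulus=4, prefix='v'):
    """Label each value with the prefix followed by value % modulus."""
    return ['{}{}'.format(prefix, x % modulus) for x in values]


list_1 = [0, 1, 2, 3, 4, 5, 10]  # hypothetical sample data
by_four = label_by_remainder(list_1)
by_three = label_by_remainder(list_1, modulus=3)
print(by_four)
print(by_three)
```

Changing the number of labels is then a matter of passing a different `modulus`.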

Tuple or list indexing

In case the output strings do not follow such a structure, we can construct a list or tuple to hold the values. Like:

# in case the values do not follow a certain structure
vals = ('v0', 'v1', 'v2', 'v3')
[vals[x%4] for x in list_1]

By indexing it that way, it is clear which value maps to which index. This works well, provided the result of the operation (here x%4) is an integer in the range 0 to n-1, for a reasonably small n.

(Default)dictionary

In case the operation does not map onto the integers 0 to n-1, but still onto a finite number of hashable items, we can use a dictionary. For instance:

d = {0: 'v0', 1: 'v1', 2: 'v2', 3: 'v3'}

or, in case we want a “fallback” value that is used when the lookup fails:

from collections import defaultdict

d = defaultdict(lambda: 1234, {0: 'v0', 1: 'v1', 2: 'v2', 3: 'v3'})

where here 1234 is used as a fallback value, and then we can use:

[d[x%4] for x in list_1]

If d is a dictionary, using d[x%4] rather than d.get(x%4) can be more useful when we want to prevent failed lookups from passing unnoticed: d[x%4] raises a KeyError in that case. Although errors are typically not a good sign, raising one when the lookup fails can be better than silently substituting a default value, since a failed lookup can be a symptom that something is not working correctly.
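
The difference between the three lookups can be seen with a dictionary that is deliberately missing one key. The dictionary `incomplete`, the sample `list_1` and the fallback string 'unknown' are assumptions made for this sketch.

```python
from collections import defaultdict

list_1 = [0, 1, 2, 3]  # hypothetical sample data
incomplete = {0: 'v0', 1: 'v1', 2: 'v2'}  # deliberately has no entry for 3

# .get() hides the missing key by returning None
with_get = [incomplete.get(x % 4) for x in list_1]

# plain indexing raises KeyError, so the gap is noticed immediately
try:
    [incomplete[x % 4] for x in list_1]
    raised = False
except KeyError:
    raised = True

# a defaultdict built from the same items supplies the fallback value instead
with_default = defaultdict(lambda: 'unknown', incomplete)
with_fallback = [with_default[x % 4] for x in list_1]

print(with_get, raised, with_fallback)
```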

How do I have Python 3 search for a specific string in a text file?

All I’m trying to do is have the program look for a specific string. If the file contains that string, it should add to a variable; if not, it should just print that it didn’t work. But it keeps saying that the string was not found when I know it is in the file.

for line in open("/home/cp/Desktop/config"):
    if "PermitRootLogin no" in line:
        num_of_points1 = num_of_points1 + 4
        num_of_vulns1 = num_of_vulns1 + 1
    else:
        print('sshd_config is still vulnerable')
        break

This is the file it is reading from:

This should work
Possibly
PermitRootLogin no
hmmmmmmmmmm
aaa

What I want it to do is find “PermitRootLogin no” in the file, but it keeps acting as if that string was never found and prints the else statement. The tutorials I found were all trying to do something else, so I’m open to any suggestions. Thank you in advance, I’m just really stuck.

Solution:

Your problem is the break after your else.
Your code reads the first line, finds that it does not contain the search string, prints the message and then breaks out of the loop before it ever reaches the matching line.
The code below works for me.
Hope this helps.

found = False

for line in open('config'):
    if "PermitRootLogin no" in line:
        num_of_points1 = num_of_points1 + 4
        num_of_vulns1 = num_of_vulns1 + 1
        found = True
if not found:
    print('sshd_config is still vulnerable')
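
The flag-and-loop pattern can also be written with the built-in any(), which stops at the first matching line. The sketch below is self-contained: it uses the lines of the sample file from the question instead of opening a file, the helper name `contains_setting` is hypothetical, and the two counters are initialised to 0 as an assumption about the surrounding program.

```python
def contains_setting(lines, setting="PermitRootLogin no"):
    """Return True if any line contains the setting text."""
    return any(setting in line for line in lines)


# Lines taken from the sample file in the question
config_lines = [
    "This should work\n",
    "Possibly\n",
    "PermitRootLogin no\n",
    "hmmmmmmmmmm\n",
    "aaa\n",
]

num_of_points1 = 0  # assumed starting values
num_of_vulns1 = 0

if contains_setting(config_lines):
    num_of_points1 += 4
    num_of_vulns1 += 1
else:
    print('sshd_config is still vulnerable')
```

With a real file, an open file object can be passed in place of the list, for example inside `with open(path) as f:`. Note that a plain substring test like this would also match a commented-out line such as `#PermitRootLogin no`.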

How can I simplify this set of if statements? (Or, what's making it feel so awkward?)

My colleague showed me this piece of code, and we both wondered why it seemed like too much code:

private List<Foo> parseResponse(Response<ByteString> response) {
    if (response.status().code() != Status.OK.code() || !response.payload().isPresent()) {
      if (response.status().code() != Status.NOT_FOUND.code() || !response.payload().isPresent()) {
        LOG.error("Cannot fetch recently played, got status code {}", response.status());
      }
      return Lists.newArrayList();
    }
    // ...
    // ...
    // ...
    return someOtherList;
}

Here’s an alternate representation, to make it a little less wordy:

private void f() {
    if (S != 200 || !P) {
        if (S != 404 || !P) {
            Log();
        }
        return;
    }
    // ...
    // ...
    // ...
    return;
}

Is there a simpler way to write this, without duplicating the !P? If not, is there some unique property about the situation or conditions that makes it impossible to factor out the !P?

Solution:

Unfortunately, the only way I can see to condense the if-statement is to remove the braces around the single-line body.

if (S != 200 || !P) {
    if (S != 404 || !P) Log();
    return A;
}
return B;

If, however, the only branch reachable through this statement were the Log(); statement, you could use the following logical identity (distributivity) to condense the logic.

(S != 200 || !P) && (S != 404 || !P) <=> (S != 200 && S != 404) || !P
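
The identity can be checked by brute force over the cases that matter. This is a sketch in Python rather than Java; the status code 500 is an arbitrary stand-in for "any code other than 200 or 404".

```python
from itertools import product

# S takes a few representative status codes; P is whether the payload is present.
mismatches = []
for S, P in product((200, 404, 500), (True, False)):
    left = (S != 200 or not P) and (S != 404 or not P)
    right = (S != 200 and S != 404) or not P
    if left != right:
        mismatches.append((S, P))

print(mismatches)  # prints [] because both sides agree in every case
```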

If you wanted to check the truth value of !P only once and keep the same logical outcome, the following would be appropriate.

if (!P) {
    Log();
    return A;
}
if (S != 200) {
    if (S != 404) Log();
    return A;
}
return B;

Or

if (S == 404 && P) return A;
if (S != 200 || !P) {
    Log();
    return A;
}
return B;
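
To confirm that both rewrites behave like the original, each variant can be modelled as a function that reports whether it logged and which value it returned. This is a Python sketch of the Java snippets above; the function names and the stand-in status code 500 are assumptions.

```python
from itertools import product


def original(S, P):
    logged = False
    if S != 200 or not P:
        if S != 404 or not P:
            logged = True
        return logged, 'A'
    return logged, 'B'


def check_p_first(S, P):
    # Variant that tests !P only once
    if not P:
        return True, 'A'
    if S != 200:
        return S != 404, 'A'
    return False, 'B'


def early_return(S, P):
    # Variant that handles the silent 404 case up front
    if S == 404 and P:
        return False, 'A'
    if S != 200 or not P:
        return True, 'A'
    return False, 'B'


cases = list(product((200, 404, 500), (True, False)))
all_equal = all(
    original(S, P) == check_p_first(S, P) == early_return(S, P)
    for S, P in cases
)
print(all_equal)
```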

Striving to write efficient code

I am writing code for a large scale app, and there is an event where lots of if/else checks are required.

For example:

if (i == 1) { /* some logic */ }
else if (i == 2) { /* some logic */ }
// ...
// ...
else if (i == 1000) { /* some logic */ }

Is there a more efficient or organised way to write this?

Solution:

Sounds like you have a collection of functions, and you just need to use a HashMap, with i as the key into the map. HashMaps are useful because they find the value for a key quickly, without comparing against many values until a match is found.

Here is an example of a HashMap from Int to Any => Any. Because Scala supports tuples, this is a completely general solution.

object Hello extends App {
  import scala.collection.immutable

  type F1 = (Any) => Any

  val hashMap: immutable.Map[Int, F1] =
    immutable.HashMap[Int, F1](
      1    -> { (a: Int)               => a * 2               }.asInstanceOf[F1], // Function literal that doubles its input
      2    -> { (tuple: (String, Int)) => tuple._1 * tuple._2 }.asInstanceOf[F1], // Function literal that repeats the string
      1000 -> { (_: Unit)              => s"The date and time is ${ new java.util.Date() }" }.asInstanceOf[F1]
    )

  def lookup(i: Int, args: Any): Any = hashMap(i)(args)

  def report(i: Int, args: Any): Unit = println(s"$i: ${ lookup(i, args) }")

  report(1, 21)
  report(2, ("word ", 5))
  report(1000, ())
}

Here is the output:

1: 42
2: word word word word word 
1000: The date and time is Sat Dec 23 19:45:56 PST 2017

Update: Here is a version that uses an array. Notice that the indices must start at 0 for this version, not an arbitrary number as before:

object Hello2 extends App {
  type F1 = (Any) => Any

  val array: Array[F1] =
    Array[F1](
      { (a: Int)               => a * 2               }.asInstanceOf[F1], // Function literal that doubles its input
      { (tuple: (String, Int)) => tuple._1 * tuple._2 }.asInstanceOf[F1], // Function literal that repeats the string
      { (_: Unit)              => s"The date and time is ${ new java.util.Date() }" }.asInstanceOf[F1]
    )

  def lookup(i: Int, args: Any): Any = array(i)(args)

  def report(i: Int, args: Any): Unit = println(s"$i: ${ lookup(i, args) }")

  report(0, 21)
  report(1, ("word ", 5))
  report(2, ())
}

Output is:

0: 42
1: word word word word word 
2: The date and time is Sat Dec 23 20:32:33 PST 2017
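
The same dispatch idea carries over to other languages. As a rough Python analogue of the HashMap version (not a translation of the Scala code; the function names `double`, `repeat` and `timestamp` are made up here), a dict can map each value of i to a function:

```python
import datetime


def double(a):
    return a * 2


def repeat(pair):
    text, count = pair
    return text * count


def timestamp(_):
    return "The date and time is {}".format(datetime.datetime.now())


# Keys can be arbitrary integers, as in the HashMap version
handlers = {1: double, 2: repeat, 1000: timestamp}


def lookup(i, args):
    """Find the handler registered for i and call it with args."""
    return handlers[i](args)


print(lookup(1, 21))
print(lookup(2, ("word ", 5)))
print(lookup(1000, None))
```

A key with no registered handler raises a KeyError here, which plays the role of the final else branch that the if/else chain never had.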