
python workout: exercise 6

pig latin sentence

problem

Now that you’ve successfully written a translator for a single English word, let’s make things more difficult: translate a series of English words into Pig Latin. Write a function called pl_sentence that takes a string containing several words, separated by spaces. (To make things easier, we won’t actually ask for a real sentence. More specifically, there will be no capital letters or punctuation.)

So, if someone were to call

pl_sentence('this is a test translation')

the output would be

histay isway away esttay ranslationtay

Print the output on a single line, rather than with each word on a separate line.

attempts

The first thing that comes to mind is to reuse the pig_latin function from the previous exercise’s solution and map it over the words of the input sentence, split on spaces:

def pig_latin(word):
    if word[0] in 'aeiou':
        return f'{word}way'
    return f'{word[1:]}{word[0]}ay'

def pl_sentence(sentence: str) -> str:
    return " ".join(map(pig_latin, sentence.split(" ")))

print(pl_sentence("this is a test translation"))
histay isway away esttay ranslationtay

We could also avoid using map and use a list comprehension:

" ".join([pig_latin(word) for word in sentence.split(" ")])

It’s clearer without being much more verbose. So it’s arguably better.

solution

I didn’t know that calling str.split with no argument splits the string on any run of whitespace.
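A quick sketch of the difference, using a made-up string with a doubled space, a tab and a trailing newline (the variable names are only for illustration):

```python
# Hypothetical input with irregular whitespace, just for illustration.
messy = "this  is\ta test\n"

# With an explicit separator, every single space is a boundary,
# so two spaces in a row produce an empty string, and tabs/newlines stay put.
by_space = messy.split(" ")

# With no argument, any run of whitespace counts as one separator,
# and leading/trailing whitespace is ignored.
by_whitespace = messy.split()

print(by_space)       # ['this', '', 'is\ta', 'test\n']
print(by_whitespace)  # ['this', 'is', 'a', 'test']
```

This matters for the translator: an empty string in the list would make word[0] raise an IndexError.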

Another nice takeaway: because strings are immutable, it’s better to collect the pieces in a list and join them once than to build up a larger string with +=.
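As a rough illustration of the two styles side by side (the word list is just sample data):

```python
words = ["histay", "isway", "away"]

# Each += creates a new string object, because str values can't change in place.
sentence = ""
for word in words:
    sentence += word + " "
sentence = sentence.rstrip()  # drop the trailing space the loop left behind

# Collecting the pieces and joining once avoids the intermediate strings
# and the trailing-separator cleanup.
joined = " ".join(words)

print(sentence == joined)  # True
```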

As for the solution itself:

def pl_sentence(sentence):
    output = []
    for word in sentence.split():
        if word[0] in 'aeiou':
            output.append(f'{word}way')
        else:
            output.append(f'{word[1:]}{word[0]}ay')
    return ' '.join(output)

print(pl_sentence('this is a test'))
histay isway away esttay

I don’t know how I feel about putting the Pig Latin logic directly in the function. But at the same time, our implementation is really an instance of a more generic function that applies a transformation to each word of a sentence. Which raises the question: why explicitly define a pl_sentence implementation, instead of something like:

from collections.abc import Callable

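def transform_words(transformation: Callable[[str], str]) -> Callable[[str], str]:
    # Returns a new function that applies `transformation` to every word of a sentence.
    return lambda sentence: " ".join(map(transformation, sentence.split()))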

def pig_latin(word):
    if word[0] in 'aeiou':
        return f'{word}way'
    return f'{word[1:]}{word[0]}ay'

pl_sentence = transform_words(pig_latin)

print(pl_sentence('this is a test'))
histay isway away esttay
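To see what the generic version buys us, here is a small sketch that reuses the same factory with two other transformations. The names shout_sentence and reverse_each_word are made up for this example and are not part of the exercise:

```python
from collections.abc import Callable

def transform_words(transformation: Callable[[str], str]) -> Callable[[str], str]:
    # Build a function that applies `transformation` to each whitespace-separated word.
    return lambda sentence: " ".join(map(transformation, sentence.split()))

# Two hypothetical transformations, only here to show the reuse.
shout_sentence = transform_words(str.upper)
reverse_each_word = transform_words(lambda word: word[::-1])

print(shout_sentence("this is a test"))     # THIS IS A TEST
print(reverse_each_word("this is a test"))  # siht si a tset
```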

beyond the exercise

text file

  • problem

    Take a text file, creating (and printing) a nonsensical sentence from the nth word on each of the first 10 lines, where n is the line number.

  • attempts

    My only concern with this one is how to handle the case when n is larger than the number of words in a line.

    Part of me wants to simply punish the user and raise an exception. I don’t see the need to take the last element; let’s be very literal with this.

    And seeing as Python already raises an IndexError when we try to retrieve an out-of-range element, we don’t have to explicitly raise one ourselves.

    It’s lazy. But oh well.

    And while we’re at it, seeing as we’ll need an Apache log file for the final extension exercise, let’s just use the same one for testing here.

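    def nonsensical_sentence(path: str) -> str:
        with open(path, 'r') as f:
            sentence = []
            for i in range(10):
                line = f.readline()
                words = line.split()
                # Index i + 1 with i starting at 0: line 1 yields its second word.
                # Raises IndexError if the line is too short or the file runs out.
                sentence.append(words[i+1])
        return " ".join(sentence)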
    print(nonsensical_sentence("/files/apache_logs.log"))
    
    - - [17/May/2015:10:05:47 +0000] "GET /presentations/logstash-monitorama-2013/images/sad-medic.png HTTP/1.1" 200 52878 "http://semicomplete.com/presentations/logstash-monitorama-2013/"
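For comparison, here is a sketch of a variant that counts both lines and words from 1, so line 1 contributes its first word. It is an alternative reading of the problem, not the version that produced the output above. The name nth_words, the max_lines parameter and the throwaway test file are all assumptions made for this example. One behavioural difference: a file with fewer than 10 lines simply yields a shorter sentence here, while a line with too few words still raises IndexError.

```python
import os
import tempfile
from itertools import islice

def nth_words(path: str, max_lines: int = 10) -> str:
    """Join word n of line n (both counted from 1) for the first max_lines lines."""
    picked = []
    with open(path) as f:
        # islice stops after max_lines lines, or earlier if the file is shorter.
        for n, line in enumerate(islice(f, max_lines), start=1):
            # Still deliberately strict: fewer than n words raises IndexError.
            picked.append(line.split()[n - 1])
    return " ".join(picked)

# A small throwaway file stands in for the log here.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a1 a2 a3\nb1 b2 b3\nc1 c2 c3\n")

result = nth_words(tmp.name)
os.remove(tmp.name)
print(result)  # a1 b2 c3
```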
    

transposition

  • problem

    Write a function that transposes a list of strings, in which each string contains multiple words separated by whitespace. Specifically, it should perform in such a way that if you were to pass the list ['abc def ghi', 'jkl mno pqr', 'stu vwx yz'] to the function, it would return ['abc jkl stu', 'def mno vwx', 'ghi pqr yz'].

  • attempts

    For one reason or another, this reminds me of a generic function from the functional paradigm I saw in SICP. But I can’t remember it :(

    Looking at the input and output, we just need to group the 1st element of each list, the 2nd element of each list…

    Wait!

    It’s just zip!

    We just need to zip the split strings in the input list and collect them:

    def transpose_lists(lists: list[str]) -> list[str]:
        return [" ".join(strings) for strings in zip(*list(map(lambda s: s.split(), lists)))]
    
    print(transpose_lists(['abc def ghi', 'jkl mno pqr', 'stu vwx yz']))
    

    This works. But it’s a bit dense for my liking.

    I like the zip approach; I just think the use of map inside is a bit excessive: too much function composition for Python’s aesthetics.

    So let’s just bring that part out:

    def transpose_lists(lists: list[str]) -> list[str]:
        split_strings = [string.split() for string in lists]
        # Join each transposed group back into a string, as the problem asks.
        return [" ".join(strings) for strings in zip(*split_strings)]
    
    print(transpose_lists(['abc def ghi', 'jkl mno pqr', 'stu vwx yz']))
    
    ['abc jkl stu', 'def mno vwx', 'ghi pqr yz']
    

    Let’s leave it at that.
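One caveat worth a sketch: zip stops at the shortest input, so a ragged list silently loses words. The exercise doesn't say what should happen in that case, so the padded variant below (using itertools.zip_longest, with each column joined back into a string) is only one possible choice, and the input is made up.

```python
from itertools import zip_longest

rows = ["abc def ghi", "jkl mno", "stu vwx yz"]  # hypothetical ragged input
split_rows = [row.split() for row in rows]

# zip stops at the shortest row, so the third column is dropped entirely.
truncated = [" ".join(column) for column in zip(*split_rows)]

# zip_longest pads short rows; the empty fill values are filtered out before joining.
padded = [
    " ".join(word for word in column if word)
    for column in zip_longest(*split_rows, fillvalue="")
]

print(truncated)  # ['abc jkl stu', 'def mno vwx']
print(padded)     # ['abc jkl stu', 'def mno vwx', 'ghi yz']
```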

reading an apache logfile

  • problem

    Read through an Apache logfile. If there is a 404 error—you can just search for '404', if you want—display the IP address, which should be the first element.

  • attempts

    This should be straightforward:

    • iterate through the logfile line by line
    • test for a 404 in the log message
    • print ip address

    What’s the best way to search for a substring in Python?

    Just the in operator, no?

    Which gives:

    def print_404s(path: str) -> None:
        with open(path, 'r') as f:
            # Iterating over the file object yields one line at a time until EOF.
            for line in f:
                if '404' in line:
                    # The IP address is the first whitespace-separated field.
                    print(line.split()[0])
    
    print_404s("/files/apache_logs.log")
    
    208.91.156.11
    208.91.156.11
    208.91.156.11
    208.91.156.11
    200.31.173.106
    204.62.56.3
    204.62.56.3
    208.91.156.11
    208.91.156.11
    208.91.156.11
    63.140.98.80
    38.99.236.50
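The plain substring test is what the problem allows, but it also matches a '404' that shows up in a path or a byte count. A stricter sketch, assuming the file follows the usual Apache common/combined layout (host, ident, user, [time], "request", status, size), checks only the status field. The regex, the function name ips_with_status and the sample lines are assumptions for illustration, and a request containing an escaped quote would not match this simple pattern.

```python
import re

# Assumed layout: host ident user [time] "request" status size ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) ')

def ips_with_status(lines, status="404"):
    """Return the client address of every line whose status field equals `status`."""
    ips = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) == status:
            ips.append(match.group(1))
    return ips

# Made-up sample lines (documentation-range addresses), not taken from the real log.
sample = [
    '192.0.2.1 - - [17/May/2015:10:05:03 +0000] "GET /missing.html HTTP/1.1" 404 310 "-" "curl"',
    '192.0.2.2 - - [17/May/2015:10:05:04 +0000] "GET /img/404.png HTTP/1.1" 200 4040 "-" "curl"',
]

print(ips_with_status(sample))  # ['192.0.2.1']
```

The second sample line contains '404' twice but has status 200, so the substring test would have reported it and this version does not.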
    
mail@jonahv.com