Ross Wan's World!

Python, Ajax, PHP and Linux.

Posts Tagged ‘re’

The Python Challenge Lv.10

Posted by Ross Wan on 2011/09/08

Lv.10

Entering level 10 with the username and password from level 8 brings up a picture of a bull. Below the picture is a Python statement:

len(a[30]) = ?

Hovering the mouse over the bull reveals that it is a link. Clicking it shows an unfinished list:

a = [1, 11, 21, 1211, 111221,

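So the task is to work out the length of a[30] from the hint given by that unfinished list.

Let's look for the pattern in the list:

a[0] = '1'
a[1] = '11', read in pairs: one '1', which is a[0]
a[2] = '21', read in pairs: two '1's, which is a[1]
a[3] = '1211', read in pairs: one '2' + one '1', which is a[2]
a[4] = '111221', read in pairs: one '1' + one '2' + two '1's, which is a[3]

Each term describes the runs of identical digits in the previous term as (count, digit) pairs. With the pattern found, the rest can be left to Python :)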

import re

def repl(match_obj):
    return '%s%s' % (len(match_obj.group()), match_obj.groups()[0])

if __name__ == '__main__':
    a = '1'
    reg = re.compile(r'(\d)\1*')
    for i in range(30):
        a = reg.sub(repl, a)
    print('a[30] = %s\n' % a)
    print('len(a[30]) = %d' % len(a))

So the answer to len(a[30]) is 5808, and the URL of the next level is: http://www.pythonchallenge.com/pc/return/5808.html
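The same run-length rule can also be sketched without regular expressions. The version below is only an illustration, assuming itertools.groupby is an acceptable substitute for the regex substitution above; the names next_term and look_and_say are made up for this example.

```python
from itertools import groupby


def next_term(term):
    """Describe each run of equal digits in `term` as <run length><digit>."""
    return ''.join('%d%s' % (len(list(run)), digit)
                   for digit, run in groupby(term))


def look_and_say(n, start='1'):
    """Return the terms a[0] .. a[n] as a list of strings."""
    terms = [start]
    for _ in range(n):
        terms.append(next_term(terms[-1]))
    return terms


a = look_and_say(30)
print(a[:5])
print('len(a[30]) = %d' % len(a[30]))
```

Keeping every term in a list makes it easy to compare the first few entries against the list shown on the challenge page before trusting the final length.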

Have fun~~


Posted in Python

The Python Challenge Lv.4

Posted by Ross Wan on 2011/08/22

Lv.4

Opening the level 4 URL gives the following content:

linkedlist.php

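Change the URL to http://www.pythonchallenge.com/pc/def/linkedlist.php and open it again to finally reach level 4. The page shows a picture of a wooden puppet sawing wood. Clicking it leads to http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345, which displays: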

and the next nothing is 44827

Following the hint, change the nothing parameter at the end of the URL to 44827 and open http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=44827, which gives another hint:

and the next nothing is 45439

Change the nothing parameter once more, to http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=45439, and this time the hint is slightly different:

Your hands are getting tired and the next nothing is 94485

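Is that supposed to be comforting? Come to think of it, isn't this kind of endlessly repetitive work exactly what a program should be doing?!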

#!/usr/bin/env python3
# coding=utf8

import re
import urllib.request

if __name__ == '__main__':
    # Start from the last number that was found by hand.
    nothing = '94485'
    while True:
        response = urllib.request.urlopen('http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=%s' % nothing)
        content = response.read().decode('ascii')
        response.close()
        print(content)
        # Follow only pages that begin with the standard hint; any other page stops the loop.
        match_obj = re.match(r'^and the next nothing is (\d+)', content)
        if match_obj:
            nothing = match_obj.group(1)
        else:
            break

After fetching a number of pages the script stopped, with this output:

and the next nothing is 16044
Yes. Divide by two and keep going.

In other words, divide by 2 and carry on. Ugh~~ Change nothing in the script above to '8022' and run it again. Before long, it stopped once more:

and the next nothing is 82682
There maybe misleading numbers in the
text. One example is 82683. Look only for the next nothing and the next nothing
is 63579

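The page warns about misleading numbers, so the chain has to be restarted from 63579. Ugh, again. Change nothing in the script above to '63579' and run it once more…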

and the next nothing is 66831

peak.html

Persistence pays off. I have no idea how long it ran, but the keyword for the next level finally showed up, and I could almost cry~~~ Lv.5: http://www.pythonchallenge.com/pc/def/peak.html
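The two manual restarts could in principle be folded into the script. The following is a minimal sketch, not the script used above: it assumes the "Divide by two" page always means halving the current number, and that the last "next nothing" on a page is the one to follow. The names fetch_page, next_nothing and follow_chain are hypothetical.

```python
import re
import urllib.request

BASE_URL = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=%s'
NEXT_RE = re.compile(r'and\s+the\s+next\s+nothing\s+is\s+(\d+)')


def fetch_page(nothing):
    """Download one page of the chain and return its text."""
    with urllib.request.urlopen(BASE_URL % nothing) as response:
        return response.read().decode('ascii')


def next_nothing(content, current):
    """Return the next value to request, or None when the chain ends."""
    if 'Divide by two' in content:
        # Assumption: halve the number that produced this page.
        return str(int(current) // 2)
    found = NEXT_RE.findall(content)
    # Assumption: misleading numbers come first, the real one last.
    return found[-1] if found else None


def follow_chain(start, fetch=fetch_page, max_steps=1000):
    """Follow the chain from `start` and return the final page text."""
    nothing = start
    for _ in range(max_steps):
        content = fetch(nothing)
        following = next_nothing(content, nothing)
        if following is None:
            return content
        nothing = following
    raise RuntimeError('chain did not end within max_steps')


# Offline check against messages quoted in this post.
print(next_nothing('Yes. Divide by two and keep going.', '16044'))
print(next_nothing('Your hands are getting tired and the next nothing is 94485', '45439'))
```

Calling follow_chain('12345') would hit the real site. Passing a different fetch function (for example a dict's get method) allows the logic to be tried without any network access.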

Have fun :~?

Posted in Python

The Python Challenge Lv.3

Posted by Ross Wan on 2011/08/20

Lv.3

One small letter, surrounded by EXACTLY three big bodyguards on each of its sides.

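Like the previous level, this one is about decoding the pile of characters in a comment in the page source. According to the hint above, we need to find the lowercase letters that have exactly three uppercase letters on each side. For example, in bXXXaXXXc the letter a qualifies. A regular expression can find these letters: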

import re

if __name__ == '__main__':
    target_chars = []
    with open('mess.txt') as f:
        for line in f:
            # Prepend '^' so that a match at the very start of a line still has a non-uppercase character before it.
            target_chars.extend(re.findall(r'(?<=[^A-Z][A-Z]{3})[a-z](?=[A-Z]{3}[^A-Z])', '^'+line))
    target = ''.join(target_chars)
    print('Target characters: %s' % target)
    print('Next url: http://www.pythonchallenge.com/pc/def/%s.html' % target)

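'^'+line puts the character "^" (or any other character that is not an uppercase letter) at the start of every line. This lets the regular expression handle the special case where a line begins like XXXaXXXb…; without a non-uppercase character in front, the regular expression above cannot match the a.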
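The effect of the prefix can be checked on a made-up line. This is only a small illustration; the sample string is hypothetical and not taken from the challenge data.

```python
import re

PATTERN = re.compile(r'(?<=[^A-Z][A-Z]{3})[a-z](?=[A-Z]{3}[^A-Z])')

# Hypothetical line that starts directly with three uppercase letters.
line = 'XXXaXXXb\n'

without_prefix = PATTERN.findall(line)
with_prefix = PATTERN.findall('^' + line)

print(without_prefix)  # [] because the lookbehind needs four characters before 'a'
print(with_prefix)     # ['a'] because '^' supplies the non-uppercase character
```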

Also, someone on the forum posted this regular expression:

r'[^A-Z][A-Z]{3}([a-z])[A-Z]{3}[^A-Z]'

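But it misses some lowercase letters that should match. For example, in 'aXXXbXXXcXXXd' it matches only 'b' and not 'c', because the first match consumes the 'c' as its trailing non-uppercase character, leaving nothing in front of the second candidate. (For this level it still produces the correct result, though :<)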
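The difference between the two patterns can be reproduced on the example string. In this sketch the forum pattern is written with balanced parentheses, which is an assumption about its intended form.

```python
import re

sample = 'aXXXbXXXcXXXd'

# Consuming pattern: the neighbouring characters become part of each match.
consuming = re.findall(r'[^A-Z][A-Z]{3}([a-z])[A-Z]{3}[^A-Z]', sample)

# Lookaround pattern: the neighbours are only inspected, never consumed.
lookaround = re.findall(r'(?<=[^A-Z][A-Z]{3})[a-z](?=[A-Z]{3}[^A-Z])', sample)

print(consuming)   # ['b']
print(lookaround)  # ['b', 'c']
```

Because lookbehind and lookahead assertions are zero-width, the same neighbouring characters can serve two adjacent matches, which is why the lookaround version also finds 'c'.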

Running the script gives the following result:

Target characters: linkedlist

Next url: http://www.pythonchallenge.com/pc/def/linkedlist.html

That gives the URL of the next level: http://www.pythonchallenge.com/pc/def/linkedlist.html

Have fun :>

Posted in Python, Uncategorized

The Python Challenge Lv.2

Posted by Ross Wan on 2011/08/19

Lv.2

recognize the characters. maybe they are in the book, but MAYBE they are in the page source.

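The above is the hint for level 2. A look at the page source shows what is going on: at the end of the source there is a long comment full of punctuation marks. The hint says "recognize the characters", meaning that meaningful characters are hidden among all that punctuation. Once the reasoning points in the right direction, the rest is easy. Below is the Python code, which uses a regular expression to pick the characters out: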

import re

if __name__ == '__main__':
    with open('mess1.txt') as f:
        mess = f.read()
    # Keep only letters and digits; everything else in the comment is noise.
    target = ''.join(re.findall(r'[a-zA-Z0-9]', mess))
    print('Target characters: %s' % target)
    print('Next url: http://www.pythonchallenge.com/pc/def/%s.html' % target)

mess1.txt is the file in which the punctuation-filled comment from the page source was saved. The output is:

Target characters: equality
Next url: http://www.pythonchallenge.com/pc/def/equality.html

So the URL of the next level is http://www.pythonchallenge.com/pc/def/equality.html
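To see the filtering step in isolation, here is a small sketch on a made-up string. The sample text is hypothetical and much shorter than the real comment; the second variant, which uses str.isalnum instead of a regular expression, is an alternative added for comparison and is not the approach used above.

```python
import re

# Hypothetical stand-in for the punctuation-filled comment.
sample = '%%$@_e^q#u)a&l[i+t}y!\n{*%$'

# Regex approach, as in the script above.
via_regex = ''.join(re.findall(r'[a-zA-Z0-9]', sample))

# Alternative: keep ASCII letters and digits with plain string methods.
via_filter = ''.join(ch for ch in sample if ch.isascii() and ch.isalnum())

print(via_regex)
print(via_filter)
```

The isascii check (available from Python 3.7) keeps the second variant equivalent to the character class for ASCII input such as this page comment.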


Have fun :)

Posted in Python