2020-04
19

Finishing in Seconds Feels Great!

By xrspook @ 20:06:17 Filed under: Playing at IT

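With my binary search, finding the 397 reverse pairs among roughly 100,000 words takes just 1.7 seconds. list.index() needs 291 seconds, and if you do not print the words as you go, you will be convinced your computer has frozen! The reference solution took 70 seconds and reported 885 pairs. Of those, 91 are really single words, not pairs: they are palindromes, each its own reverse, and a word paired with itself hardly counts as two words! The remaining 794 lines are the 397 genuine pairs printed twice, once as (A, B) and again as (B, A). The reference solution's code is concise, but it does not handle these special cases.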
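The idea can be tried on a small scale. The following is a minimal sketch, not the listing from this post: it assumes the standard-library bisect module for the binary search, and the helper names contains and reverse_pairs and the toy word list are made up for illustration. The single comparison word < rev is what drops both the palindromes and the mirrored (B, A) duplicates.

```python
from bisect import bisect_left


def contains(sorted_words, target):
    """Binary-search membership test on an already sorted list."""
    i = bisect_left(sorted_words, target)
    return i < len(sorted_words) and sorted_words[i] == target


def reverse_pairs(words):
    """Return each reverse pair once, skipping palindromes."""
    sorted_words = sorted(words)
    pairs = []
    for word in sorted_words:
        rev = word[::-1]
        # word < rev rejects palindromes (word == rev) and the mirrored
        # duplicate (rev, word) in one comparison.
        if word < rev and contains(sorted_words, rev):
            pairs.append((word, rev))
    return pairs


# Toy word list, made up for illustration (not words.txt).
sample = ["stressed", "desserts", "level", "tuba", "abut", "python"]
print(reverse_pairs(sample))
```

On this toy list, "level" is a palindrome and is skipped, and each of the two real pairs appears only once.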

I beat the reference solution. Feels great!!!

Exercise 11: Two words are a “reverse pair” if each is the reverse of the other. Write a program that finds all the reverse pairs in the word list. Solution: http://thinkpython2.com/code/reverse_pair.py.

import time

def in_bisect(library, first, last, myword):
    # Binary search: a sorted list of about 100,000 words needs fewer than 20 probes.
    if first > last:  # empty range, so the word is absent (this base case saved me)
        return -1
    mid = (first + last) // 2
    if myword == library[mid]:
        return mid
    elif library[mid] > myword:
        return in_bisect(library, first, mid - 1, myword)
    else:
        return in_bisect(library, mid + 1, last, myword)

count = 0
library = []
with open('words.txt') as fin:
    for line in fin:
        library.append(line.strip())
library.sort()
start = time.time()
for i in range(len(library)):  # binary search version
    j = in_bisect(library, 0, len(library) - 1, library[i][::-1])
    # library[i] < library[j] skips palindromes and the mirrored (B, A) duplicate
    if j > -1 and library[i] < library[j]:
        print(library[i], library[j])
        count += 1
# for i in range(len(library)):  # list.index() version
#     if library[i][::-1] in library:
#         j = library.index(library[i][::-1])  # no end bound, so the last word is searched too
#         if library[i] < library[j]:
#             print(library[i], library[j])
#             count += 1
print(count)
end = time.time()
print(end - start)
# abut tuba
# ad da
# ados soda
# agar raga
# agas saga
# agenes senega
# ah ha
# aider redia
# airts stria
# ajar raja
# alif fila
# am ma
# amen nema
# amis sima
# an na
# anger regna
# animal lamina
# animes semina
# anon nona
# ante etna
# are era
# ares sera
# aril lira
# arris sirra
# arum mura
# at ta
# ate eta
# ates seta
# auks skua
# avid diva
# avo ova
# ay ya
# bad dab
# bag gab
# bal lab
# bals slab
# ban nab
# bard drab
# bas sab
# bat tab
# bats stab
# bed deb
# ben neb
# bid dib
# big gib
# bin nib
# bins snib
# bird drib
# bis sib
# bog gob
# bos sob
# bots stob
# bows swob
# brad darb
# brag garb
# bud dub
# bun nub
# buns snub
# bur rub
# burd drub
# burg grub
# bus sub
# but tub
# buts stub
# cam mac
# cap pac
# cares serac
# cod doc
# cram marc
# cud duc
# dag gad
# dah had
# dahs shad
# dam mad
# dap pad
# dart trad
# daw wad
# debut tubed
# decal laced
# dedal laded
# deem meed
# deep peed
# deeps speed
# deer reed
# dees seed
# defer refed
# degami imaged
# deifier reified
# deil lied
# deke eked
# del led
# delf fled
# deliver reviled
# dels sled
# demit timed
# denier reined
# denies seined
# denim mined
# dens sned
# depot toped
# depots stoped
# derat tared
# derats stared
# dessert tressed
# desserts stressed
# devas saved
# devil lived
# dew wed
# dewans snawed
# dexes sexed
# dial laid
# dialer relaid
# diaper repaid
# dig gid
# dim mid
# dinar ranid
# diols sloid
# dirts strid
# do od
# dog god
# dom mod
# don nod
# doom mood
# door rood
# dor rod
# dormin nimrod
# dorp prod
# dos sod
# dot tod
# drail liard
# draw ward
# drawer reward
# draws sward
# dray yard
# dual laud
# ducs scud
# duel leud
# duo oud
# dup pud
# dups spud
# eat tae
# edile elide
# edit tide
# eel lee
# eh he
# elides sedile
# em me
# emes seme
# emir rime
# emit time
# emits stime
# enol lone
# er re
# ergo ogre
# eros sore
# ervil livre
# etas sate
# even neve
# evil live
# eviler relive
# fer ref
# fires serif
# flog golf
# flow wolf
# fool loof
# gal lag
# gals slag
# gam mag
# gan nag
# gar rag
# gas sag
# gat tag
# gats stag
# gel leg
# gelder redleg
# get teg
# gip pig
# girt trig
# gnar rang
# gnat tang
# gnats stang
# gnaws swang
# gnus sung
# got tog
# gul lug
# gulp plug
# guls slug
# gum mug
# gums smug
# guns snug
# gut tug
# habus subah
# hahs shah
# hales selah
# hap pah
# hay yah
# hey yeh
# ho oh
# hoop pooh
# hop poh
# is si
# it ti
# jar raj
# kay yak
# keel leek
# keels sleek
# keep peek
# keets steek
# kips spik
# knaps spank
# knar rank
# knits stink
# lager regal
# lair rial
# lap pal
# lares seral
# larum mural
# las sal
# leer reel
# lees seel
# leets steel
# leper repel
# lever revel
# levins snivel
# liar rail
# lin nil
# lion noil
# lit til
# lobo obol
# loom mool
# loons snool
# loop pool
# loops spool
# loot tool
# looter retool
# loots stool
# lop pol
# lotos sotol
# macs scam
# maes seam
# map pam
# mar ram
# marcs scram
# mart tram
# mat tam
# maws swam
# may yam
# meet teem
# meter retem
# mho ohm
# mils slim
# mir rim
# mis sim
# mon nom
# moor room
# moot toom
# mot tom
# mures serum
# mus sum
# muts stum
# namer reman
# nap pan
# naps span
# neep peen
# net ten
# neves seven
# new wen
# nip pin
# nips spin
# nit tin
# no on
# nolos solon
# nos son
# not ton
# notes seton
# now won
# nu un
# nus sun
# nut tun
# nuts stun
# oat tao
# oohs shoo
# oot too
# os so
# ow wo
# pacer recap
# pals slap
# pans snap
# par rap
# part trap
# parts strap
# pas sap
# pat tap
# paw wap
# paws swap
# pay yap
# peels sleep
# pees seep
# per rep
# pets step
# pins snip
# pis sip
# pit tip
# pols slop
# pools sloop
# poons snoop
# port trop
# ports strop
# pot top
# pots stop
# pow wop
# pows swop
# prat tarp
# pupils slipup
# puris sirup
# pus sup
# put tup
# raps spar
# rat tar
# rats star
# raw war
# ray yar
# rebus suber
# rebut tuber
# recaps spacer
# redes seder
# redips spider
# redraw warder
# redrawer rewarder
# rees seer
# reflet telfer
# reflow wolfer
# reknit tinker
# reknits stinker
# relit tiler
# remeet teemer
# remit timer
# rennet tenner
# repins sniper
# res ser
# rot tor
# sallets stellas
# saps spas
# sat tas
# saw was
# scares seracs
# secret terces
# seeks skees
# selahs shales
# sirs sris
# sit tis
# six xis
# skeets steeks
# skips spiks
# sleeps speels
# sleets steels
# slit tils
# sloops spools
# smart trams
# smuts stums
# snaps spans
# snaw wans
# snaws swans
# snips spins
# snit tins
# snoops spoons
# snoot toons
# snot tons
# snow wons
# sow wos
# spat taps
# spay yaps
# spirt trips
# spirts strips
# spit tips
# sports strops
# spot tops
# spots stops
# sprat tarps
# sprits stirps
# staw wats
# stew wets
# stow wots
# stows swots
# straw warts
# strow worts
# struts sturts
# swat taws
# sway yaws
# swot tows
# tav vat
# taw wat
# tew wet
# tort trot
# tow wot
# trow wort
# way yaw
# 397, 291.1146504878998 # list.index() search
# 397, 1.7120981216430664 # binary search
# 885, 70.3680248260498 # result of running the reference solution
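The comment on in_bisect says a lookup among about 100,000 entries needs fewer than 20 steps. A quick way to check that claim is to count the probes. This is only a sketch under stated assumptions: bisect_steps is a hypothetical iterative rewrite of the same search, and a list of integers stands in for the sorted word list.

```python
import math


def bisect_steps(sorted_items, target):
    """Iterative binary search; returns (index or -1, number of probes)."""
    first, last = 0, len(sorted_items) - 1
    steps = 0
    while first <= last:
        steps += 1
        mid = (first + last) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] > target:
            last = mid - 1
        else:
            first = mid + 1
    return -1, steps


# Stand-in for a sorted list of 100,000 words (assumption: integers
# behave the same as strings as far as the probe count is concerned).
items = list(range(100_000))
worst = max(bisect_steps(items, x)[1] for x in items)
bound = math.floor(math.log2(len(items))) + 1
print(worst, bound)
```

Since log2(100000) is about 16.6, the worst case stays comfortably under 20 probes, while list.index() may scan the whole list for every word.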
2020-04
15

These Are My Answers, Anyway

By xrspook @ 19:44:55 Filed under: Playing at IT

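The exercises are listed here without definitive answers. Below are my solutions; I do not know whether they are right. The words.txt file is available here.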

There are solutions to these exercises in the next section. You should at least attempt each one before you read the solutions.

Exercise 1: Write a program that reads words.txt and prints only the words with more than 20 characters (not counting whitespace).

Exercise 2: In 1939 Ernest Vincent Wright published a 50,000 word novel called Gadsby that does not contain the letter “e”. Since “e” is the most common letter in English, that’s not easy to do. In fact, it is difficult to construct a solitary thought without using that most common symbol. It is slow going at first, but with caution and hours of training you can gradually gain facility. All right, I’ll stop now. Write a function called has_no_e that returns True if the given word doesn’t have the letter “e” in it. Write a program that reads words.txt and prints only the words that have no “e”. Compute the percentage of words in the list that have no “e”.

Exercise 3: Write a function named avoids that takes a word and a string of forbidden letters, and that returns True if the word doesn’t use any of the forbidden letters. Write a program that prompts the user to enter a string of forbidden letters and then prints the number of words that don’t contain any of them. Can you find a combination of 5 forbidden letters that excludes the smallest number of words?

Exercise 4: Write a function named uses_only that takes a word and a string of letters, and that returns True if the word contains only letters in the list. Can you make a sentence using only the letters acefhlo? Other than “Hoe alfalfa”?

Exercise 5: Write a function named uses_all that takes a word and a string of required letters, and that returns True if the word uses all the required letters at least once. How many words are there that use all the vowels aeiou? How about aeiouy?

Exercise 6: Write a function called is_abecedarian that returns True if the letters in a word appear in alphabetical order (double letters are ok). How many abecedarian words are there?
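For Exercise 6, a hand-written loop is easy to get wrong by one position, for example by never comparing the last two letters. One way to cross-check such a loop is to compare it against a version built on sorted(). The sketch below is an illustration only, and both function names are made up here.

```python
def is_abecedarian_sorted(word):
    """True if the letters never decrease from left to right."""
    return list(word) == sorted(word)


def is_abecedarian_pairs(word):
    """Same check, comparing every adjacent pair including the last one."""
    return all(a <= b for a, b in zip(word, word[1:]))


# Toy inputs, chosen to include a word that only fails on its last pair.
for w in ["abc", "beef", "almost", "abdc", "a", ""]:
    assert is_abecedarian_sorted(w) == is_abecedarian_pairs(w)
    print(repr(w), is_abecedarian_sorted(w))
```

If two independent implementations disagree on the count for words.txt, at least one of them has a boundary bug.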

with open('words.txt') as fin:  # Exercise 1
    for line in fin:
        word = line.strip()  # strip first so the line ending is not counted
        if len(word) > 20:  # more than 20 characters
            print(word)
# counterdemonstrations
# hyperaggressivenesses
# microminiaturizations
 
def has_no_e(word):  # Exercise 2
    for letter in word:
        if letter == 'e':
            return False
    return True
total = 0  # renamed from `all`, which shadows the built-in all()
count = 0
with open('words.txt') as fin:
    for line in fin:
        word = line.strip()
        total = total + 1
        if has_no_e(word):
            print(word)
            count = count + 1
print(count, 'words without e')
print('{:.0%}'.format(count / total), 'words without e')
# ...
# zymosis
# zymotic
# zymurgy
# 37641 words without e
# 33% words without e
 
def avoids(word, x):  # Exercise 3; I give up on the last question
    for letterw in word:
        for letterx in x:
            if letterw == letterx:
                return False
    return True
x = input('without: ')
num = 0
# word = 'jwrojgre' # input('word is ')
# print(avoids(word, x))
with open('words.txt') as fin:
    for line in fin:
        word = line.strip()
        if avoids(word, x):
            num = num + 1
print(num, 'words without', x)
# without: aeiou
# 107 words without aeiou
# count = 0
# import itertools
# for i in itertools.combinations('abcdefghijklmnopqrstuvwxyz', 5):
#     print(''.join(i))
#     count = count + 1
# print(count) # 65780 possible combinations, aaargh
 
 
def uses_only(word, x): # Exercise 4
    for letter in word:
        if letter not in x:
            return False
    return True
word = input('word is ')
x = input('uses is ')
print(uses_only(word, x))
# word is abc
# uses is efg
# False
 
def uses_all(word, x):  # Exercise 5
    for letter in x:
        if letter not in word:
            return False
    return True
x = input('must use: ')
num = 0
with open('words.txt') as fin:
    for line in fin:
        word = line.strip()
        if uses_all(word, x):
            num = num + 1
print(num, 'words with', x)
# must use: aeiou
# 598 words with aeiou
# must use: aeiouy
# 42 words with aeiouy
 
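def is_abecedarian(word):  # Exercise 6
    index = 1
    # Run up to len(word), not len(word) - 1, so the last two letters are compared too.
    while index < len(word):
        if ord(word[index-1]) > ord(word[index]):
            return False
        index = index + 1
    return True
num = 0
with open('words.txt') as fin:
    for line in fin:
        word = line.strip()
        if is_abecedarian(word):
            num = num + 1
print(num, 'words are abecedarian')
# 1573 words is abecedarian
# (That figure came from the original loop, which stopped at len(word) - 1 and
# never compared the last two letters; the corrected loop reports a lower count.)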
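The last question of Exercise 3, which the listing above leaves unanswered, asks which 5 forbidden letters exclude the fewest words. With 65780 combinations, calling avoids on every word for every combination is slow, but the search can be sped up. The sketch below is one possible approach, not the book's solution: it assumes one big integer per letter used as a bitset over the word list, and least_excluding and the toy data are made-up names for illustration.

```python
from itertools import combinations
from string import ascii_lowercase


def least_excluding(words, k=5, alphabet=ascii_lowercase):
    """Return (letters, excluded_count) for the k forbidden letters
    that exclude the fewest words (first such combination wins ties)."""
    # One big integer per letter: bit i is set if words[i] contains it.
    contains = {c: 0 for c in alphabet}
    for i, word in enumerate(words):
        for c in set(word):
            if c in contains:
                contains[c] |= 1 << i
    best = None
    for combo in combinations(alphabet, k):
        excluded = 0
        for c in combo:
            excluded |= contains[c]  # union of words hit by any forbidden letter
        n = bin(excluded).count("1")  # number of excluded words
        if best is None or n < best[1]:
            best = ("".join(combo), n)
    return best


# Toy data, made up for illustration; for the real exercise, pass the
# stripped lines of words.txt and keep the defaults (k=5, a to z).
toy = ["apple", "banana", "cherry", "date", "fig", "grape"]
print(least_excluding(toy, k=2, alphabet="abcdefg"))
```

Each combination then costs a handful of integer OR operations and one bit count instead of a pass over every word, which should make the full 65780-combination search practical.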
2017-05
19

Densely Packed

By xrspook @ 8:40:39 Filed under: Lousy Diary

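Strip every timestamp from a subtitle file, replace the line breaks between sentences with spaces, then print it on A4 paper in xiaowu (9 pt) type with 8 pt line spacing and 1.5 cm margins. What kind of monster is that? From a distance it is not a text at all but a pattern. It is densely packed and legible, but definitely not a format meant for comfortable reading. Even in this layout I still printed a little over three and a half pages. Had I kept the line breaks, then even split into six or seven columns, with the same type size and line spacing, seven A4 pages might not have held it. The paper and the printer cost me nothing, but I am stingy by nature, and when other people's money can be saved it should be. Overall, the type is still not as small or as dense as the Oxford Advanced Learner's Dictionary. Besides, this pile of text is for looking things up, not for reading: I only consult it when something seems doubtful, so I use it much like a dictionary. For the parts I remember well I do not need to check the original at all, and this so-called original is only some foreigners' incomplete translation anyway. With line spacing this tight, a ruler is the essential tool. I cannot remember when I last read with a ruler. Even once the final official English subtitles are out, I will probably print them in the same dense way. I may do the same when proofreading the Chinese, because scrolling through those subtitle files is too tiring: too many line breaks and too many timestamps, none of which matter to the translation itself.

Only when I actually have to translate something from English into Chinese do I realise how poor my English is and how shaky my Chinese is. Even when nobody asks for elegance, why are my sentences never as fluent as other people's? That has to do with how much I normally read: if I kept my reading at a certain level, my writing would flow better. But that reading has to be books, not fast-food content. Fast-food content is a culture of exaggeration: it either pumps you up or feeds you chicken soup for the soul, always in slightly provocative language. Compared with genuinely plain writing, it is overexcited and overdone. Read nothing but that all day and you will go a little crazy yourself.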

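Yesterday morning I bought a Sunday ticket for my sixth viewing of Dangal (《摔跤吧!爸爸》). Yesterday afternoon I finally sorted out Saturday's as well. The two tickets total 52.9 yuan: 35.1 for Sunday and 17.8 for Saturday. The two cinemas could hardly be more different. Saturday's is said to rank at the bottom among the cinemas in Haizhu District, while Sunday's screening is in an Atmos hall, which is bound to be interesting. As far as I remember, I have never watched a film in an Atmos hall. Hell first, then heaven: I rather like that order. For me, going to see Dangal is not only about the film itself; it is also an inspection of the cinemas. Different places, different crowds, different settings, different experiences. My colleague finally could not help asking how many more times I plan to support the box office. My answer: as long as the film is still showing, I will go every weekend. But now that I have been to so many cinemas, choosing gets tricky if I do not want to repeat one. Looking across Guangzhou, few cinemas remain that are near me and still willing to schedule it. What I want most is to watch Dangal in the Atmos hall or the giant-screen hall at the Wanda cinema in Wanshengwei, but I also know this will never happen, because they will never put this film in their best halls. First, Huayi and Wanda are sworn rivals. Second, Dangal is just an ordinary 2D film, even though it has an Atmos sound mix, so to them putting it in the giant-screen hall is probably a waste. When business is slow, the giant-screen hall simply shows nothing. So far, Dangal has won screenings in almost every cinema in Guangzhou, but the top-end presentation we want most may never happen.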

Life is never perfect, and that is exactly what gives us room to keep pushing ourselves toward something better.

2009-08
1

Perhaps He Will Never Know

By xrspook @ 23:22:13 Filed under: Lousy Diary

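There was a time when I thought Dad was all-powerful where written characters were concerned! He owns a great many improbable dictionaries, and he has copied out countless dictionary pages onto all sorts of media. Whatever character I could not write, he could write it on the spot, and if I needed to dig further he would pull the character and its related information out of some dictionary or other. While my classmates had only dictionaries to rely on, I had a living dictionary at my side. That was the most impressive thing about Dad, and what always made him seem almost sacred to me. From primary school on I understood that, beyond the Xinhua Dictionary (《新华字典》) and the Modern Chinese Dictionary (《现代汉语词典》), there were many, many dictionaries thicker than pillows, printed with characters stranger than "Martian script".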

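Last week I needed the character "馥". At first I overlooked it in the Sogou input method and could not find it, so I started grumbling aloud about how on earth it was pronounced (a failure, the failure of someone who cannot type Wubi!). Dad was sure it was pronounced "fù" and started looking it up in the Modern Chinese Dictionary. At some point I stopped using the syllable index at the front when looking words up by pinyin and began flipping straight into the body of the dictionary. Dad's method was the same as mine, but he was flipping purely at random. From the way he flipped I could tell that his grasp of the order of the 26 letters was a mess, or perhaps not a mess, but at least on that day, in that situation, he had lost his bearings. The dictionary wizard of old turned out, in my eyes, to be nothing so special. What I take for granted seemed an unknown skill in his hands, or perhaps not unknown, but something he has always ignored because he turns a deaf ear to anything technical. I have always thought of Dad as a scholar, but I do not think research works with that attitude: a scholar who cannot absorb new tools and develop them together with his learning will only be called a "bookworm".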

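Today he asked me to help him find something called the 《汉字部首表》 (a table of Chinese character radicals). His "evidence" was a short passage from the Guangzhou Daily of some unknown day: no date, no source, just a scrap of paper. We could call it a "clipping", but in the eyes of today's xrspook its traceability is weak. It should have been properly pasted into a notebook with the source and date written down, since those are the key information for finding the article again. I took "汉字部首表" to Teacher G (that is, Google), and good grief, what came back was a heap of news items, all copied from one another; tracing the original source would have taken a lot of effort and time. What I wanted was the 《汉字部首表》 that was to take effect on 1 May 2009! Instinct told me it had to be a standard, and after some detours I learned that it was the standard "GF 0011-2009 汉字部首表" (GF stands for 规范, "norm"), a language and script standard. Taking effect at the same time, and closely related, is another one called "GF 0012-2009 GB13000.1字符集汉字部首归部规范". The former has 10 pages, the latter 237! Perhaps because they were not compressed, each page of the two PDFs is huge, about 0.8 MB per page. I have downloaded many standards, in Chinese, in English, even in Spanish and French. Some were converted files and some were scanned images, but none had ever taken up so much space. Sigh. It took nearly an hour to download the two files, almost 200 MB together. Today is the first day of August and I have already been forced to use more than an hour of my online time (30 hours/month). Downloading is a mechanical operation; knowing how to download and where to download from is the skill!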

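Dad may never know that his daughter has a knack for this. He is always after relatively obscure things, yet I can always find a way to knock them out, because I have been immersed in even more baffling information. Take "汉字部首表": search for those characters and you mostly get news, when what you actually want is the PDF of the standard! So the breakthrough is to find the standard number, the one piece of information that news reports routinely drop. It is unique, and it appears only on pages that offer precise purchase or download information. Entering just "GF 0011-2009" in the search box filters out all that annoying duplicated news, and within the first few results I found the download link. If Dad understood computers, he might appreciate even better how vast and deep knowledge is. Sadly, he refuses even to use a mobile phone, let alone use a computer and the internet skilfully and accurately. Dad is no exception: many of his classmates seem to scorn modern technology, and some even take pride in not knowing how to use a mobile phone or in not needing one. As an onlooker and a member of the younger generation, I see clearly the real reason they have fallen behind: it is not that they are too old, but that they have shut themselves in! In their day books were cutting-edge, but things are different now. Compared with other media, books update too slowly, which is why university teachers like to say that what is in the books is the classics: it may be called "new technology", but it is really "old technology" already. Dad relies heavily on books. Is he unable to accept new ways of communication, or is he refusing to accept the new world?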

Dad may never know that I know all this. When will he learn that his daughter has been blogging every day since 9 June 2004? I do not know. Perhaps he never will...

© 2004 - 2024 我的天 | Theme by xrspook | Power by WordPress