2023-11
1

不闪烁的跨表复制

By xrspook @ 8:00:17 归类于: 烂日记

昨天说到当我想用VBA把一个工作簿里面的某些工作表复制到另外一个工作簿的时候,如果要复制的工作表有多个,复制过程中就会出现闪烁。如果需要复制的工作表很多,闪烁的时间会比较长。我觉得批量选择,最后一次性复制是一个解决办法。如果不是用VBA,而是手动实现这个复制功能,的确会用批量选择,然后再复制。现在按住Ctrl,然后用鼠标选择需要复制的工作表,最后进行复制。如果要在VBA里面实现这个批量选择,该怎么做呢?

一个工作簿里的工作表该如何判断哪些是需要的,哪些是不需要的,对我来说只需要判定工作表的名称就可以,所以我可以对需要的工作表以某种方式结合起来,比如形成一个以固定分隔符的字符串。VBA里要表达批量选择,然后复制,到底该用什么东西呢?这一次我没有搜索,而是直接录制了个宏。结果发现原来批量选择再复制,在宏里默认是通过数组的。需要把目标工作表的名称组合成一个一维数组,然后用最普通的复制方式把拷贝到新的工作簿。所以我要做的就是把目标工作表的名称赋值给一个一维数组。因为不知道到底要复制多少个工作表,所以这个数组应该是一个动态数组,我马上想到的方式就是以默认分隔符打断字符串的方式赋值给一维的动态数组。这个操作非常高效,而且动态数组的大小完全不需要在一开始就定义好,字符串里面有多少东西就可以生成多大的一维数组。

接下来,我要做的是经过一轮循环筛选出某个工作簿里我需要的工作表的名称,然后把这些名称以固定分隔符的方式连接起来。最后在打断之前我还要干掉一个分隔符,因为无论我把分隔符放在最前面还是最后面,整个字符串还是会多一个分格符。我不知道多一个分隔符在赋值给一维数组的时候会不会失败,作为完美主义者,我宁愿手动把那个分隔符去掉。那不过是一个字符串取值的简单操作而已。处理好字符串以后,就可以打断那个字符串赋值给一维数组,最后通过经典的复制方式,把工作簿里的目标工作表拷贝到另外一个工作簿。

我没有测试过这样处理耗时跟基础经典处理有多大差别,但就肉眼来说,差别挺大。因为如果进行这般批量选择一次性复制,基本上可以秒杀完成,但如果使用经典的方法、工作表又比较多,我感觉得闪烁一秒钟以上。秒杀就能完成我感觉那个耗时会在0.3秒以内。

批量选择再复制的确可以秒杀实现拷贝工作表,但是光标会停留在最后一个被复制的工作表里,我需要光标仍然停留在我的点击VBA控件的工作表。所以最后我又增加了一个激活目标工作表,把鼠标锁定在某个单元格的功能。这个功能不是非如此不可的,没有也不影响数据处理本身,但就方便程度而言,这是显而易见的。就如没有初始化和拷贝数据那两个模块,核心功能没问题,但是多了这些贴心的小步骤。整个核对的过程会变得更顺手。

如果使用的时候能更顺畅,我宁愿把代码搞得复杂一些。毕竟复杂的代码是一次性的。

PS:测试
复制19个工作表:批量法0.85秒,经典法7.25秒。
复制8个工作表:批量法0.59秒,经典法3.02秒。
复制4个工作表:批量法0.48秒,经典法1.59秒。

Sub 批量拷贝工作表-批量法()
    Application.ScreenUpdating = False '关闭屏幕刷新
    Dim book As Workbook
    Dim sht As Worksheet
    Dim my_name As String, stringArray() As String
 
    my_name = ""
 
    For Each book In Workbooks
        If book.Name <> "AAA.xlsm" Then
            For Each sht In book.Worksheets
                If Not sht.Name Like "*BBB*" Then
                    my_name = my_name & sht.Name & ","
                End If
            Next
            my_name = Mid(my_name, 1, Len(my_name) - 1)
            stringArray = Split(my_name, ",")
            book.Sheets(stringArray).Copy before:=ThisWorkbook.Worksheets("条件")
        End If
    Next
 
    With ThisWorkbook
        .Worksheets("条件").Activate
        .Worksheets("条件").Range("A27").Select
    End With
 
    MsgBox "完成数据抓取!"
    Application.ScreenUpdating = True '打开屏幕刷新
End Sub
 
************************************
 
Sub 批量拷贝工作表-经典法()
    Application.ScreenUpdating = False '关闭屏幕刷新
 
    Dim book As Workbook
    Dim sht As Worksheet
    For Each book In Workbooks
        If book.Name <> "AAA.xlsm" Then
            For Each sht In book.Worksheets
                If Not sht.Name Like "*BBB*" Then
                    sht.Copy before:=ThisWorkbook.Worksheets("条件")
                End If
            Next
        End If
    Next
 
    With ThisWorkbook
        .Worksheets("条件").Activate
        .Worksheets("条件").Range("A27").Select
    End With
 
    MsgBox "完成数据抓取!"
    Application.ScreenUpdating = True '打开屏幕刷新
End Sub
2020-04
26

算算书里有多少单词

By xrspook @ 18:12:57 归类于: 扮IT

算算书里有多少单词应该是很大路简单的事,但实际上各种状况层出不穷。有些是你料到的,比如排版的用了全角的标点符号,程序默认会删掉标点符号,万一排版那个没有规范地使用空格呢?有些是你不会料到的,比如手误创造出奇葩字符串。很早以前我就发现Notepad++和Word里算的字数是不一致的,Notepad++通常算出来的数都会大一些。谁对谁错,随缘吧,知道大概差不多也就行了,毕竟高考的时候你写少几个字不到800也不会真扣你的分。

字典和列表的相爱相杀我体会得越来越深刻了。

words.txt在这里,emma.txt在这里。

Exercise 1: Write a program that reads a file, breaks each line into words, strips whitespace and punctuation from the words, and converts them to lowercase. Hint: The string module provides a string named whitespace, which contains space, tab, newline, etc., and punctuation which contains the punctuation characters. Let’s see if we can make Python swear:
>>> import string
>>> string.punctuation
‘!”#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~’
Also, you might consider using the string methods strip, replace and translate.

Exercise 2: Go to Project Gutenberg (http://gutenberg.org) and download your favorite out-of-copyright book in plain text format. Modify your program from the previous exercise to read the book you downloaded, skip over the header information at the beginning of the file, and process the rest of the words as before. Then modify the program to count the total number of words in the book, and the number of times each word is used. Print the number of different words used in the book. Compare different books by different authors, written in different eras. Which author uses the most extensive vocabulary?

Exercise 3: Modify the program from the previous exercise to print the 20 most frequently used words in the book.

Exercise 4: Modify the previous program to read a word list (see Section 9.1) and then print all the words in the book that are not in the word list. How many of them are typos? How many of them are common words that should be in the word list, and how many of them are really obscure?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
import string
fin = open('words.txt')
mydict = {}
for line in fin:
    word = line.strip()
    mydict[word] = ''
file = open('emma.txt', encoding = 'utf-8')
essay = file.read().lower()
essay = essay.replace('-', ' ')
pun = {}
str_all = '“' + '”' + string.punctuation
for x in str_all: # 建立各种标点符号字符的字典
    pun[x] = ''
useless = essay.maketrans(pun) # maketrans必须被替换和替换等长,字典完美解决这个问题
l = essay.translate(useless).split() # 那些含-的单词会死得很惨,但仍然算是个单词
print('this book has', len(l), 'words')
book = {}
for item in l: # 读取文件为字符串,字符串转为单词列表,列表转为计数的字典,单词为键,次数为键值
    book[item] = book.get(item, 0) + 1
list_words1 = sorted(list(zip(book.values(), book.keys())), reverse = True) # 字典转为列表,键与键值换位
print('this book has', len(list_words1), 'different words')
print('times', 'word', sep='\t')
count = 1
word_len = 0 # 限制最小词长
for times, word in list_words1: # 打印大于某长度用得最多的20个词(不限制,3个字母及以下最最简单的会刷屏)
    if len(word) > word_len:
        print(times, word, sep='\t')
        count += 1
    if count > 20:
        break
count = 0
for word in book:
    if word not in mydict:
        # print(word, end=' ')
        count += 1
print(count, 'words in book not in dict') # 结果惨不忍睹,合计590个
# this book has 164065 words
# this book has 7479 different words
# times   word
# 5379    the
# 5322    to
# 4965    and
# 4412    of
# 3191    i
# 3187    a
# 2544    it
# 2483    her
# 2401    was
# 2365    she
# 2246    in
# 2172    not
# 2069    you
# 1995    be
# 1815    that
# 1813    he
# 1626    had
# 1448    as
# 1446    but
# 1373    for
# 590 words in book not in dict
# -----------------------------解法二----------------------------- 其实就是切单词方法有差异
import string
def set_book(fin1):
    useless = string.punctuation + string.whitespace + '“' + '”'
    d = {}
    for line in fin1:
        line = line.replace('-', ' ')
        for word in line.split():
            word = word.strip(useless)
            word = word.lower()
            d[word] = d.get(word, 0) + 1
    return d
def set_dict(fin2):
    d = {}
    for line in fin2:
        word = line.strip()
        d[word] = d.get(word, 0) + 1
    return d
fin1 = open('emma.txt', encoding='utf-8')
fin2 = open('words.txt')
book = set_book(fin1)
mydict = set_dict(fin2)
l = sorted(list(zip(book.values(), book.keys())), reverse=True)
count = 0
for key in book:
    count = count + book[key]
print('this book has', count, 'words')
print('this book has', len(book), 'different words')
num = 20
print(num, 'most common words in this book')
print('times', 'word', sep='\t')
for times, word in l:
    print(times, word, sep='\t')
    num -= 1
    if num < 1:
        break
count = 0
for word in book:
    if word not in mydict:
        # print(word, end=' ')
        count += 1
# print()
print(count, 'words in book not in dict')
# this book has 164120 words
# this book has 7531 different words
# 20 most common words in this book
# times   word
# 5379    the
# 5322    to
# 4965    and
# 4412    of
# 3191    i
# 3187    a
# 2544    it
# 2483    her
# 2401    was
# 2364    she
# 2246    in
# 2172    not
# 2069    you
# 1995    be
# 1815    that
# 1813    he
# 1626    had
# 1448    as
# 1446    but
# 1373    for
# 683 words in book not in dict
2020-04
22

字典还能这样玩!

By xrspook @ 18:30:55 归类于: 扮IT

一开始,我自己写的脚本能运行,但慢到怀疑人生。吃了个饭,折腾了半个小时后,字母表才处理到b而已,显然这是个失败的操作。我的做法是常规地为词汇表建立字典,然后历遍字典里的每个单词,单词进入函数后跟字典的另一个单词比较,比较方法是把单词(即字符串)打散为字符列表然后排列,如果排列一致,且被比较的单词小于拿去比的单词,它们就是一伙的,贴在被比较的单词列表下。列表长度大于2就返回列表然后打印。这样是可以选出异构词的,但非常非常慢!

看过参考答案之后我跳起来了,他们用了一句”.join(lists),这等于是把列表str重新粘成一个字符串,我那个去!他们把单词用列表打散重排再粘回去,最关键的是,这个唯一的重排字符串他们在建立字典的时候就作为key,所有与之有一样字符的全部被看作小弟被放置这个键的键值里。字典还是字典,但字典的键成了规则字符串,键值则是排列组合过的词汇表。我根本没想到啊,怎么可能想得到呢!!!!!

题目要求倒序打印,然后要求找出能组成最多异构词的8个字母。但实际上参考答案的输出问非所答,比如没有倒序,比如只是把8个字母的异构词摆出来,没确切告诉你最多的是什么。

Exercise 2: More anagrams! Write a program that reads a word list from a file (see Section 9.1) and prints all the sets of words that are anagrams. Here is an example of what the output might look like:
[‘deltas’, ‘desalt’, ‘lasted’, ‘salted’, ‘slated’, ‘staled’]
[‘retainers’, ‘ternaries’]
[‘generating’, ‘greatening’]
[‘resmelts’, ‘smelters’, ‘termless’]
Hint: you might want to build a dictionary that maps from a collection of letters to a list of words that can be spelled with those letters. The question is, how can you represent the collection of letters in a way that can be used as a key? Modify the previous program so that it prints the longest list of anagrams first, followed by the second longest, and so on. In Scrabble a “bingo” is when you play all seven tiles in your rack, along with a letter on the board, to form an eight-letter word. What collection of 8 letters forms the most possible bingos? Solution: http://thinkpython2.com/code/anagram_sets.py.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
from time import time
def sorted_anagram(d):
    l = []
    for key in d:
        if len(d[key]) > 1:
            l.append((len(d[key]), d[key])) # 这是个由列表创建的元组?
    return sorted(l, reverse = True) # 倒序神马真折腾
def eight_letters(d, num):
    global length # 全局变量都用上了,就为了记录个最大值
    new_l = []
    for key in d:
        if len(key) == num and len(d[key]) > 1:
           new_l.append((len(d[key]), d[key]))
           if len(d[key]) >= length:
               length = len(d[key])
    return sorted(new_l)
def sorted_letters(word):
    list_word = sorted(list(word)) # 先把字符串打散为字符列表,然后排序
    reword =''.join(list_word) # 再把字符列表回粘成字符串
    return reword
def set_dict(fin):
    d = {}
    for line in fin:
        word = line.strip()
        reword = sorted_letters(word) # 打散重排相当关键,必须在建立字典时就做!!!
        if reword not in d:
            d[reword] = [word] # 字典的键已经不是单词,是纯粹的规律字符串
        else:
            d[reword].append(word) # 字典的键值才是词汇表里的单词
    return d
fin = open('words.txt')
length = 0
count = 0
start = time()
d = set_dict(fin)
for item in sorted_anagram(d):
    print(item)
    count += 1
print(count)
for item in eight_letters(d, 8):
    if item[0] == length:
        print(item)
end = time()
print(end - start)
# ......
# (2, ['abacas', 'casaba'])
# (2, ['aba', 'baa'])
# (2, ['aals', 'alas'])
# (2, ['aal', 'ala'])
# (2, ['aahed', 'ahead'])
# (2, ['aah', 'aha'])
# 10157 # 全体异构词
# (7, ['angriest', 'astringe', 'ganister', 'gantries', 'granites', 'ingrates', 'rangiest'])
# 异构词最多的8字母单词(共7个异构词)
# 0.6079998016357422
2020-04
21

找2个词汇表

By xrspook @ 17:59:37 归类于: 扮IT

最根本的在于做好词语的切片和组合,知道要找什么词,丢到字典里找那是秒杀的事。我不知道他们为什么这么折腾,直接在有音标的词汇表里找不就得了,但偏偏要两个词汇表都要求同时存在,于是循环判断神马不得不多一层。幸好字典里用in那是非常简单快捷的事。

Exercise 6: Here’s another Puzzler from Car Talk (http://www.cartalk.com/content/puzzlers): This was sent in by a fellow named Dan O’Leary. He came upon a common one-syllable, five-letter word recently that has the following unique property. When you remove the first letter, the remaining letters form a homophone of the original word, that is a word that sounds exactly the same. Replace the first letter, that is, put it back and remove the second letter and the result is yet another homophone of the original word. And the question is, what’s the word? Now I’m going to give you an example that doesn’t work. Let’s look at the five-letter word, ‘wrack.’ W-R-A-C-K, you know like to ‘wrack with pain.’ If I remove the first letter, I am left with a four-letter word, ’R-A-C-K.’ As in, ‘Holy cow, did you see the rack on that buck! It must have been a nine-pointer!’ It’s a perfect homophone. If you put the ‘w’ back, and remove the ‘r,’ instead, you’re left with the word, ‘wack,’ which is a real word, it’s just not a homophone of the other two words. But there is, however, at least one word that Dan and we know of, which will yield two homophones if you remove either of the first two letters to make two, new four-letter words. The question is, what’s the word? You can use the dictionary from Exercise 1 to check whether a string is in the word list. To check whether two words are homophones, you can use the CMU Pronouncing Dictionary. You can download it from http://www.speech.cs.cmu.edu/cgi-bin/cmudict or from http://thinkpython2.com/code/c06d and you can also download http://thinkpython2.com/code/pronounce.py, which provides a function named read_dictionary that reads the pronouncing dictionary and returns a Python dictionary that maps from each word to a string that describes its primary pronunciation. Write a program that lists all the words that solve the Puzzler. Solution: http://thinkpython2.com/code/homophone.py.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
from pronounce import read_dictionary
from time import time
def set_dict(fin):
    d ={}
    for line in fin:
        word = line.strip()
        d[word] = 0
    return d
def same_pronounce(a, b, c, mydict, pdict):
    if b in mydict and c in mydict:
        if b in pdict and c in pdict:
            if pdict[a] == pdict[b] and pdict[a] == pdict[c]:
                return True
    return False
start = time()
count = 0
pdict = read_dictionary()
fin = open('words.txt')
mydict = set_dict(fin)
for word in mydict:
    if word in pdict:
        word1 = word[1:]
        word2 = word[:1] + word[2:]
        if same_pronounce(word, word1, word2, mydict, pdict):
            print(word, word1, word2)
            count += 1
print(count)
end = time()
print(end - start)
# llama lama lama
# llamas lamas lamas
# scent cent sent
# 3
# 0.28800010681152344
2020-04
15

反正这是我的答案

By xrspook @ 19:44:55 归类于: 扮IT

题目摆在这里,没有确切的答案,下面是我的解答,对不对不知道。words.txt资源在这里。

There are solutions to these exercises in the next section. You should at least attempt each one before you read the solutions.

Exercise 1: Write a program that reads words.txt and prints only the words with more than 20 characters (not counting whitespace).

Exercise 2: In 1939 Ernest Vincent Wright published a 50,000 word novel called Gadsby that does not contain the letter “e”. Since “e” is the most common letter in English, that’s not easy to do. In fact, it is difficult to construct a solitary thought without using that most common symbol. It is slow going at first, but with caution and hours of training you can gradually gain facility. All right, I’ll stop now. Write a function called has_no_e that returns True if the given word doesn’t have the letter “e” in it. Write a program that reads words.txt and prints only the words that have no “e”. Compute the percentage of words in the list that have no “e”.

Exercise 3: Write a function named avoids that takes a word and a string of forbidden letters, and that returns True if the word doesn’t use any of the forbidden letters. Write a program that prompts the user to enter a string of forbidden letters and then prints the number of words that don’t contain any of them. Can you find a combination of 5 forbidden letters that excludes the smallest number of words?

Exercise 4: Write a function named uses_only that takes a word and a string of letters, and that returns True if the word contains only letters in the list. Can you make a sentence using only the letters acefhlo? Other than “Hoe alfalfa”?

Exercise 5: Write a function named uses_all that takes a word and a string of required letters, and that returns True if the word uses all the required letters at least once. How many words are there that use all the vowels aeiou? How about aeiouy?

Exercise 6: Write a function called is_abecedarian that returns True if the letters in a word appear in alphabetical order (double letters are ok). How many abecedarian words are there?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
fin = open('words.txt') # 第1小问
for line in fin:
    if len(line) >= (20+2):
        word = line.strip()
        print(word)
# counterdemonstrations
# hyperaggressivenesses
# microminiaturizations
 
def has_no_e(word): # 第2小问
    for letter in word:
        if letter == 'e':
            return False
    return True
fin = open('words.txt')
all = 0
count = 0
for line in fin:
    word = line.strip()
    all = all + 1
    if has_no_e(word):
        print(word)
        count = count + 1
print(count, 'words without e')
print('{:.0%}'.format(count/all), 'words without e')
# ...
# zymosis
# zymotic
# zymurgy
# 37641 words without e
# 33% words without e
 
def avoids(word, x): # 第3小问,最后一个问题举手投降
    for letterw in word:
        for letterx in x:
            if letterw == letterx:
                return False
    return True
fin = open('words.txt')
x = input('withtout: ')
num = 0
# word = 'jwrojgre' # input('word is ')
# print(avoids(word, x))
for line in fin:
    word = line.strip()
    if avoids(word, x):
        num = num + 1
print(num, 'words without', x)
# withtout: aeiou
# 107 words without aeiou
# count = 0
# import itertools
# for i in itertools.combinations('abcdefghijklmnopqrstuvwxyz', 5):
#     print(''.join(i))
#     count = count + 1
# print(count) # 65780个排列组合的可能性啊啊啊啊啊啊
 
 
def uses_only(word, x): # 第4小问
    for letter in word:
        if letter not in x:
            return False
    return True
word = input('word is ')
x = input('uses is ')
print(uses_only(word, x))
# word is abc
# uses is efg
# False
 
def uses_all(word, x): # 第5小问
    for letter in x:
        if letter not in word:
            return False
    return True
fin = open('words.txt')
x = input('must use: ' )
num = 0
for line in fin:
    word = line.strip()
    if uses_all(word, x):
        num = num + 1
print(num, 'words with', x)
# must use: aeiou
# 598 words with aeiou
# must use: aeiouy
# 42 words with aeiouy
 
def is_abecedarian(word): # 第6小问
    index = 1
    while index < len(word) - 1:
        if ord(word[index-1]) > ord(word[index]):
            return False
        index = index + 1
    return True
fin = open('words.txt')
num = 0
for line in fin:
    word = line.strip()
    if is_abecedarian(word):
        num = num + 1
print(num, 'words is abecedarian')
# 1573 words is abecedarian
© 2004 - 2024 我的天 | Theme by xrspook | Power by WordPress