问题描述
我目前有2个工作表,为简单起见,在说明中将它们称为 Sheet1
和 Sheet2
.在 Sheet1
中,我有大约5万行数据.我试图遍历 Sheet1
并在数据集中找到唯一的匹配项,然后转移到 Sheet2
.
I currently have 2 worksheets, for simplicity sake let's call them Sheet1
and Sheet2
in the explanations. In Sheet1
I have around 50k rows of data. I am trying to go through Sheet1
and find unique occurrences in the data set to then transfer across to Sheet2
.
以下是我到目前为止使用的方法及其对所用时间的粗略估计.
Below are the methods I have used so far and their rough estimates for time taken.
方法A-使用 For
循环使用 For
循环遍历 Sheet1
,如果满足条件,则在VBA中进行条件检查-将该行上8个单元格的范围传输到 Sheet2
.此方法在60分钟内完成60%.
Method A - Iterate through Sheet1
with a For
loop with the conditional check programmed in VBA, if condition is met - transfer a range of 8 cells on that row to Sheet2
. This method completes 60% in 60 minutes.
方法B-我认为在VBA中删除条件检查可以加快速度,因此我在 Sheet1
中创建了一个新列,该列带有返回"Y"的 IF
语句如果满足条件.然后,我遍历此列,如果有"Y",则将出现的事件转移到 Sheet2
中.奇怪的是,此方法比方法A花费更长的时间,即60分钟内达到50%.
Method B - I thought that removing the condition check in VBA could speed things up so I created a new column in Sheet1
with an IF
statement that returns "Y" if the condition is met. I then iterate through this column and if there is a "Y" - transfer the occurrence across to Sheet2
. This weirdly takes longer than method A, namely 50% in 60 mins.
Sub NewTTS()
Dim lRow1 As Long, lRow2 As Long
Dim i As Long
With wsOTS
lRow1 = .Range("E" & .Rows.Count).End(xlUp).Row
For i = lRow1 To 2 Step -1
If .Range("P" & i).Text = "Y" Then
lRow2 = wsTTS.Range("E" & wsTTS.Rows.Count).End(xlUp).Row + 1
wsTTS.Range("E" & lRow2, "AA" & lRow2).Value = .Range("E" & i, "AA" & i).Value
End If
Next i
End With
End Sub
方法C-然后,我在另一篇文章中读到 .Find()
方法比使用 For
循环方法更快.因此,我在返回"Y"的列中使用了 .Find()
,然后将事件转移到 Sheet2
.这是迄今为止最快的方法,但仍只能在60分钟内完成75%.
Method C - I then read on another post that the .Find()
method is quicker than using For
loop method. As such I used a .Find()
in the column that returns the "Y" and then transfer event across to Sheet2
. This is the fastest method so far but still only completes 75% in 60 mins.
Sub SearchOTS()
Application.ScreenUpdating = False
Dim startNumber As Long
Dim lRow1 As Long, lRow2 As Long
Dim i As Long
Dim startTime As Double
startTime = Time
lRow1 = wsOTS.Range("E" & wsOTS.Rows.Count).End(xlUp).Row
Columns("P:P").Select
Selection.Find(What:="Y", After:=ActiveCell, LookIn:=xlValues, LookAt:= _
xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False _
, SearchFormat:=False).Activate
startNumber = ActiveCell.Row
lRow2 = wsTTS.Range("E" & wsTTS.Rows.Count).End(xlUp).Row + 1
wsTTS.Range("E" & lRow2, "AA" & lRow2).Value = wsOTS.Range("E" & ActiveCell.Row, "AA" & ActiveCell.Row).Value
For i = 1 To lRow1
Selection.FindNext(After:=ActiveCell).Activate
If ActiveCell.Row = startNumber Then GoTo ProcessComplete
lRow2 = wsTTS.Range("E" & wsTTS.Rows.Count).End(xlUp).Row + 1
wsTTS.Range("E" & lRow2, "AA" & lRow2).Value = wsOTS.Range("E" & ActiveCell.Row, "AA" & ActiveCell.Row).Value
wsOTS.Range("B18").Value = i / lRow1
Next i
ProcessComplete:
Application.ScreenUpdating = True
MsgBox "Complete! Time taken: " & Format(Time - startTime, "hh:mm:ss")
End Sub
方法D-然后我读了另一篇文章,说最快的方法是建立一个数组,然后遍历该数组.我使用一个集合(动态的)来代替数组,然后遍历 Sheet1
并存储发生的行号.然后,我遍历集合并将事件转移到 Sheet2
中.此方法在60分钟内返回50%.
Method D - I then read another post saying that the fastest way would be to build an array and then loop through the array. Instead of an array I used a collection (dynamic), and I iterate through Sheet1
and store the row numbers for the occurences. I then loop through the collection and transfer the events across to Sheet2
. This method returns 50% in 60 mins.
Sub PleaseWork()
Dim i As Long
Dim lRow1 As Long, lRow2 As Long
Dim myCol As New Collection
lRow1 = wsOTS.Range("E" & wsOTS.Rows.Count).End(xlUp).Row
'build collection of row numbers
For i = 1 To lRow1
If wsOTS.Range("P" & i).Text = "Y" Then
myCol.Add i
End If
Next i
'now go through collection and build TTS
For i = 1 To myCol.Count
lRow2 = wsTTS.Range("E" & wsTTS.Rows.Count).End(xlUp).Row + 1
wsTTS.Range("E" & lRow2, "N" & lRow2).Value = wsOTS.Range("E" & myCol(i), "N" & myCol(i)).Value
Next i
Set myCol = New Collection
End Sub
我正在尝试找到最快的方法来完成此任务,但是我尝试过的所有方法都需要一个多小时才能完成.
I am trying to find the fastest way to complete this task but all the methods I have tried are yielding greater than an hour to complete.
这里有什么我想念的吗?有更快的方法吗?
Is there anything I am missing here? Is there a faster method?
推荐答案
访问范围非常慢,并且导致运行时间长.如果您已经知道要读取1000行,则不要一次读取它们.而是将整个范围拉入缓冲区,然后仅使用该缓冲区.写作也一样.如果您事先不知道要写多少,请写一些例如长度为100行.
Accessing a range is abysmally slow, and the cause for your long runtime. If you already know that you are going to read 1000 rows, do not read them one at a time. Instead, pull the whole range in a buffer, then work only with that buffer. Same goes for writing. If you do not know in advance how much you will write, make chunks of e.g. 100 rows length.
()示例:
Sub PleaseWork()
Dim i As Long, j as long
Dim lRow1 As Long, lRow2 As Long
Dim myCol As New Collection
Dim column_p() as variant
dim inbuffer() as Variant
dim outbuffer() as variant
lRow1 = wsOTS.Range("E" & wsOTS.Rows.Count).End(xlUp).Row
' Get whole Column P at once
column_p = wsOTS.Range("P1").Resize(lRow1, 1).Value
'build collection of row numbers
For i = 1 To lRow1
If column_p(i, 1) = "Y" Then
myCol.Add i
End If
Next i
'now go through collection and build TTS
lRow2 = myCol.Count 'Number of required rows
' get whole input range
inbuffer = wsOTS.Range("E1").Resize(lRow1, 10).Value
' prepare output
ReDim outbuffer(1 to lRow2, 1 to 10)
For i = 1 To myCol.Count
' write into outbuffer
for j = 1 to 10
outbuffer(i, j) = inbuffer(myCol(i), j)
Next
Next i
' Set whole output at once
wsTTS.Range("E1").Resize(lRow2, 10).Value = outbuffer
Set myCol = New Collection
End Sub
这篇关于在工作表之间传输大量数据的最快方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!