Home » Handling pairs of pattern matching in multiple excel files through VB macros

Handling pairs of pattern matching in multiple excel files through VB macros


my assumptions:

Excel_In1.xls == catalog.xlsx (table has unique key pair columnA & columnB in each row)

Excel_In2.xls == factdata.xlsx (table has multiple duplicate key pairs; and contains data in fixed columns to be copied)

Excel_Out.xls == book Out.xlsm

Option Explicit

Private Type TState
    catalog As Object
    selectedData As Object
End Type

Private this As TState

Public Sub ExampleSubMain()
End Sub

Private Sub Init()
    Set this.selectedData = CreateObject("Scripting.Dictionary")
End Sub

Private Sub InitCatalogDictionary()
    MakeTheBookOpened "D:vbasomefolder", "catalog.xlsx"
    Dim wb As Workbook
    Set wb = Workbooks("catalog.xlsx")
    Dim dataRange As Range
    Set dataRange = wb.Worksheets("catalogSheet").Range("a2:b10") 'for example "a2:b10"

    Set this.catalog = MakeDict(dataRange)
End Sub

Private Function MakeDict(ByVal dataRange As Range) As Object
    Dim result As Object
    Set result = CreateObject("Scripting.Dictionary")
    Dim row As Range
    For Each row In dataRange.Rows
        'asumes column A,B are true keys and their pairs are unique, value = empty string
        result.Add Join(Array(row.Cells(1), row.Cells(2))), ""
    Next row
    Set MakeDict = result
End Function

Private Sub MakeTheBookOpened(ByVal pathWithSeparator As String, _
                              ByVal wbName As String)
    If TheBookIsOpenedAlready(wbName) Then Exit Sub
    Workbooks.Open Filename:=pathWithSeparator & wbName, ReadOnly:=True
End Sub

Private Function TheBookIsOpenedAlready(ByVal Name As String) As Boolean
    Dim wb As Workbook
    For Each wb In Workbooks
        If wb.Name = Name Then TheBookIsOpenedAlready = True: Exit Function
    Next wb
End Function

Private Sub PickData()
    MakeTheBookOpened "D:vbasomefolder", "factdata.xlsx"
    Dim wb As Workbook
    Set wb = Workbooks("factdata.xlsx")
    Dim dataRange As Range
    Set dataRange = wb.Worksheets("factSheet").Range("a2:k10") 'for example "a2:k10"
    Dim row As Range
    For Each row In dataRange.Rows
        Dim key As String
        key = Join(Array(row.Cells(4), row.Cells(6))) 'asumes product123, companyABC columns are there
        If this.catalog.Exists(key) Then this.selectedData.Add row, ""
    Next row
End Sub

Private Sub ExecuteCopying()
    If this.selectedData.Count = 0 Then Exit Sub
    Dim rowsNum As Long
    rowsNum = this.selectedData.Count
    Dim columnsNum As Long
    columnsNum = 3                               'for example 3
    Dim resultArr As Variant
    ReDim resultArr(1 To rowsNum, 1 To columnsNum)
    Dim pos As Long
    pos = 1
    Dim item As Variant
    For Each item In this.selectedData
        Dim row As Range
        Set row = item
        resultArr(pos, 1) = row.Cells(2)         'B
        resultArr(pos, 2) = row.Cells(7)         'G
        resultArr(pos, 3) = row.Cells(10)        'J
        pos = pos + 1
    Next item
    'book Out.xlsm
    ThisWorkbook.Worksheets(1).Range("a1").Resize(rowsNum, columnsNum) = resultArr
End Sub

Given two basic input files roughly matching your description:

enter image description here


enter image description here

And assuming the macro would reside in the out file, we could construct a multi-function macro to accomplish this with a few steps.

The first part of the macro knows filenames and parameters. You used column L in the first input file, but let’s make that configurable. The same with most of the other parameters, like the first line to start on so our input files can have headers.

Second, we need to open the first input file and read the keywords. There’s several ways to do this, but a very simple way to do it is to do a plan CSV line, so that from the first file, you can extract your “keywords” (your term): product123,product456. This can then be iterated over with a For Each loop through the second file.

In the second file, a very simple construct would be to loop over all entries. Depending on your needs, you may need to iterate through the second file only once if it is prohibitively large. Both of these function assume the first blank line terminates the input. If the row in the 2nd input file matches your target, you will perform your copy.

Finally, the copy also takes a CSV line for which columns to keep (keeping it configurable). Each column, as well as the first keyword, will be copied to the target worksheet, starting at row 1 with a configurable column start.

The final output in the output sheet looks something like this:

enter image description here

The output starts in the second column because that was what was specified in the configuration.

There may be more elegant approaches, but this is a straight-forward approach to it.

Const EXCEL_1 As String = "Excel1.xls"
Const EXCEL_1_KW_COL As String = "A"
Const EXCEL_2 As String = "Excel2.xls"
Const EXCEL_2_KW_COL As String = "A"
Const EXCEL_2_COPY_COLS As String = "B,E,G"
Const EXCEL_3 As String = "Excel3.xls"
Const EXCEL_3_TARGET As String = "B"

Public Function LoadInformation3()
  Dim Location As String, Keywords As String
  Application.ScreenUpdating = False
  Location = Application.ActiveWorkbook.Path
  Keywords = LoadKeywords(Location & EXCEL_1, EXCEL_1_KW_COL)
  Debug.Print "Keys=" & Keywords
  Dim L, CurrentDestRow As Long
  For Each L In Split(Keywords, ",")
    SearchKeywordAndCopy CurrentDestRow, Location & EXCEL_2, Location & EXCEL3, L, EXCEL_2_KW_COL, EXCEL_2_COPY_COLS, EXCEL_3_TARGET
  Application.ScreenUpdating = True
End Function

Public Function LoadKeywords(ByVal File As String, ByVal ColumnId As String, Optional ByVal FirstRow As Long = 2)
  Dim Wb1 As Workbook
  Dim Value As String, N As Long
  Set Wb1 = Workbooks.Open(File)
  N = FirstRow
  LoadKeywords = ""
  Do While True
    Value = Wb1.Sheets(1).Range(ColumnId & N).Text
    If Value = "" Then Exit Do
    LoadKeywords = LoadKeywords & IIf(LoadKeywords = "", "", ",") & Value
    N = N + 1
  Wb1.Close SaveChanges:=False
End Function

Public Sub SearchKeywordAndCopy(ByRef CurrentDestRow As Long, ByVal FileSource As String, ByVal FileTarget As String, ByVal Keyword As String, ByVal SourceColumn As String, ByVal SourceCopyFrom As String, ByVal DestCopyTo As String)
  Dim WbSource As Workbook, WbDest As Workbook
  Dim Value As String, N As Long
  Set WbDest = Application.ActiveWorkbook
  Set WbSource = Workbooks.Open(FileSource)
  N = 2
  Do While True
    Value = WbSource.Sheets(1).Range(SourceColumn & N).Text
    If Value = "" Then Exit Do
    If Value <> Keyword Then GoTo NextRow
    Dim L, M As Long
    CurrentDestRow = CurrentDestRow + 1
    WbDest.Sheets(1).Range(DestCopyTo & CurrentDestRow).Value = Keyword
    M = 0
    For Each L In Split(SourceCopyFrom, ",")
      Dim CopyValue As String
      CopyValue = WbSource.Sheets(1).Range(L & N).Text
      M = M + 1
      WbDest.Sheets(1).Range(DestCopyTo & CurrentDestRow).Offset(, M).Value = CopyValue
    N = N + 1
  WbSource.Close SaveChanges:=False
End Sub

Your setup as best I could understand it:



And… This is the code I wrote:

Option Explicit
Option Base 1
Sub CopyData()
    Dim XLout As Workbook   'Excel_Out.xls
    Dim XLin1 As Workbook   'Excel_In1.xls
    Dim XLin2 As Workbook   'Excel_In2.xls
    Dim ProductList         'Product/Company List from XLin1
    Dim ProductListO()      'Concatenated Version of above
    Dim DataList            'Product/Company List from XLin2
    Dim DataXcol            'Extra Data to pull from Column X in XLin2
    Dim DataZcol            'Extra Data to pull from Column Z in XLin2
    Dim Output()            'Output Array for XLout
    Dim i As Long           'Iterations
    Dim counter As Long     'Item number
    Dim TimeCount
    TimeCount = Timer
        ' >>> All Workbooks
    Set XLout = ThisWorkbook
    Set XLin1 = Workbooks.Open("C:UsersccritchlowDocumentsATestExcel_In1.xls")
    Set XLin2 = Workbooks.Open("C:UsersccritchlowDocumentsATestExcel_In2.xls")
        ' >>> Store Source Data in Arrays
    With XLin2.Sheets(1)
        DataList = .Range("A2:B" & .Range("A" & Rows.Count).End(xlUp).Row)
        DataXcol = .Range("X2:X" & .Range("A" & Rows.Count).End(xlUp).Row)
        DataZcol = .Range("Z2:Z" & .Range("A" & Rows.Count).End(xlUp).Row)
    End With
        ' >>> Store Product List Data in Arrays
    ProductList = XLin1.Sheets(1).Range("L2:M" & XLin1.Sheets(1).Range("M" & Rows.Count).End(xlUp).Row)
    ReDim ProductListO(1 To UBound(ProductList, 1))
    For i = 1 To UBound(ProductList, 1)
        ProductListO(i) = ProductList(i, 1) & "-" & ProductList(i, 2)
    Next i
        ' >>> Move entries from XLin2 (that exist on XLin1) into "Output" Array
    ReDim Preserve Output(UBound(DataList, 1), 3)
    counter = 1
    For i = 1 To UBound(DataList, 1)
        DataList(i, 1) = DataList(i, 1) & "-" & DataList(i, 2)
        If Not IsError(Application.Match(DataList(i, 1), ProductListO(), 0)) Then
            Output(counter, 1) = DataList(i, 2)
            Output(counter, 2) = DataXcol(i, 1)
            Output(counter, 3) = DataZcol(i, 1)
            counter = counter + 1
        End If
    Next i
        ' >>> Output to XLout
    XLout.Sheets(1).Range("A2").Resize(UBound(Output, 1), 3) = Output()

    Application.StatusBar = "Total Time to review " & UBound(DataList, 1) & " lines = " & Timer - TimeCount

End Sub

It does the following

  • Is stored on “Excel_Out.xls”
  • Opens both “Excel_In#.xls” workbooks
  • Stores all required data in arrays
  • Identifies data on XLin2 whose “company&productname” exist on XLin1
  • Outputs that data to “Excel_Out.xls”

This is how it looks:

enter image description here

Related Solutions

Java simple accounting program [closed]

There's a spelling mistake (an extra 'e' on the input parameter annualInterest*e*Rate): public void setAnnualInterestRate(double annualInteresteRate){ this.annualInterestRate = annualInterestRate; } Therefore you are setting the this.annualInterestRate to...

pc and mobile site code using user agent

You could do something like this through media queries <link rel="stylesheet" media="screen and (min-device-width: 1024px)" href="https://stackoverflow.com/questions/21937880/pc.css" /> <link rel="stylesheet" media="screen and (max-device-width:...

How many CPU ticks does it take to store a single None? [closed]

In short: In this case, since X does not implement Drop, storing None is done in a single instruction. The number of CPU cycles depends heavily on the exact hardware and on caching effects. But assuming a modern x86-64 CPU and assuming the memory is in L1 cache...

sql aggregate and split [closed]

This uses STRING_AGG to aggregate the strings into one long one, and then a Tally Table to split into the new rows. It is assumed you have a column to order by. if you do not, you cannot achieve what you are after without one as data in a table is stored in an...

What is “-bash: !”: event not found”

You can turn off history substitution using set +H. ! is a special character to bash, it is used to refer to previous commands; eg, !rm will recall and execute the last command that began with the string "rm", and !rm:p will recall but not execute the last...

Visualize a sparse matrix using Python Turtle graphics

What does the problem mean by (rows-1, columns-1)? This is tied up with your mysterious m variable and the order() function you left undefined. Let's proceed anyway. We can see from the matrix() function we're dealing with a square matrix but let's not even...

What does an “extra” semicolon means? [duplicate]

A for loop is often written int i; for (i = 0; i < 6; i++) { ... } Before the first semicolon, there is code that is run once, before the loop starts. By omitting this, you just have nothing happen before the loop starts. Between the semicolons, there is the...

Python: Compare lists, perform operation, create new list

I think the downvotes are justified because you did not provide any code at all to show us what you tried so far, so that we know where you are stuck with your code. Also i think the way you wrote the requirements is a bit confusing (0eth probably means nth...

Python put multiple line array into a single line

I think you did not the best job at describing the problem, it is not really clear why you have such files (i guess most people were assuming you have an existing array in python) containing an array definition as if it was written for a specific language with...

Sort array contains numeric string using c# array

You need give an order (actually a equivalence relation) to be able to sort. The order on characters is usually a,b,c,... and the order usually given on words such as 'one' is called lexicographic order. However you want to sort by the meaning behind, its...

Sort (hex) colors by Hue [closed]

In this updated example also we don't care about hash, we'll just deal with an arrays of hex colors: require 'paint' #intantiate web_colors from gist and shuffle them eval(`curl...

read file string and store in uint8_t array in c [closed]

Is that what you wanted? (to test give binary 01 combination as an first argument) #include <stdio.h> #include <stdint.h> uint8_t charToBin(char c) { switch(c) { case '0': return 0; case '1': return 1; } return 0; } uint8_t CstringToU8(const char *...

my JavaScript using jQuery is not working [closed]

You didn't include jQuery in your code: <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script> <script type="text/javascript" src="https://stackoverflow.com/questions/43397734/temp.js"></script> Make...

How to check if a Date falls in a certain week or Quarter [closed]

I would say do these steps, 1 - figure out first and last day of the week Get current week start and end date in Java - (MONDAY TO SUNDAY) 2 - check if the date falls in between that range or not Java: how do I check if a Date is within a certain range? - if...

How to add two object together

You should use a numeric data type for arithmetic operations (not Object). With out seeing the code in add() my recommendation is to store the total price in a double primitive. double price =0; for(int x = 0; x < rprice.size(); x++) { //you may need to...