Improve CSV performance (SFTP to Salesforce)

Christoph Eschweiler Apracor GmbH posted this 2 weeks ago

Hello,

We have a package that imports around 1 million rows daily from CSV files on an SFTP server into Salesforce.

The 1 million rows are unevenly distributed across 15 CSV files.

Each CSV file maps to a distinct object in Salesforce. 

There are no lookups, expressions, etc. in the package's tasks, and no workflows, triggers, etc. listening on the SF objects. The objects are also not detail objects in a master-detail relationship; they have no parent objects. We simply move data from CSVs to Salesforce.

We use the Salesforce Bulk API.

However, this package takes around 4-5 hours to execute.

When I run the same files sequentially through Apex Data Loader I'm done in less than 30 minutes.
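
For comparison, a direct Bulk API load of one of these files with an explicit batch size looks roughly like this (a minimal Python sketch using the simple_salesforce library; the credentials, object name, and file name are placeholders):

```python
import csv
from simple_salesforce import Salesforce

# Placeholder credentials -- replace with real ones.
sf = Salesforce(username="user@example.com",
                password="secret",
                security_token="token")

# Read one CSV into a list of dicts; the column headers must match
# the Salesforce field API names.
with open("orders.csv", newline="", encoding="utf-8") as f:
    records = list(csv.DictReader(f))

# Insert via the Bulk API with the maximum batch size of 10,000 records.
results = sf.bulk.MyObject__c.insert(records, batch_size=10000)

failed = [r for r in results if not r["success"]]
print(f"{len(records) - len(failed)} inserted, {len(failed)} failed")
```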

Is there any way for us to reduce the execution time of this package? 4-5 hours is way too long for our use case.

What I tried so far without success:

  • Switching to the SOAP API
  • Checking "Preserve Task Order" for the package

Thanks

Chris

Mariia Zaharova posted this 2 weeks ago

Hello Chris,

If we understood correctly, all the tasks in this package are independent of each other. Did you try splitting this package into several smaller ones?
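
Conceptually, splitting the package lets the independent loads run side by side instead of one after another, so the total time approaches that of the largest file. A rough Python sketch of the idea (load_file here is a placeholder for whatever performs a single CSV-to-object load):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# One entry per CSV file / target object pair (placeholder names).
FILES = ["accounts.csv", "orders.csv", "products.csv"]  # ... up to 15 files

def load_file(path):
    """Placeholder: download `path` from the SFTP server and push its
    rows to the corresponding Salesforce object via the Bulk API."""
    return path

# Because the tasks are independent, they can run concurrently.
with ThreadPoolExecutor(max_workers=5) as pool:
    futures = [pool.submit(load_file, f) for f in FILES]
    for fut in as_completed(futures):
        print("finished", fut.result())
```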

The batch size (for both the SOAP and Bulk APIs), as well as the execution time, depends on the overall structure and complexity of a package, the number of mapped fields, the size of the values in the records, etc.

Please let us know if this solution works for you.

Best regards,

Mariia

Christoph Eschweiler Apracor GmbH posted this 2 weeks ago

Hello Mariia,

Thanks for your response.

Yes, the tasks are independent of each other.

I think I didn't make one thing quite clear in my original post: the problem is the slow performance of each individual CSV file.

I took the second-to-largest CSV file (about 130k rows), put it into a separate package with only that one task, and ran it there. Unfortunately, it is still very slow. The batch size seems to be 297, which is very small. The CSV file contains 23 mapped columns, most of which hold numbers in the range of 0 to 10,000 or short strings.

Is there any way to improve this performance?

Update: 131,795 rows took 48 minutes and 50 seconds.
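
To put those numbers in perspective, a quick back-of-the-envelope check (assuming the batch size of 297 I observed):

```python
rows = 131_795
batch_size = 297                 # batch size observed in the run
total_seconds = 48 * 60 + 50     # 48 min 50 s = 2,930 s

batches = -(-rows // batch_size)           # ceiling division
print(batches)                             # 444 batches
print(round(total_seconds / batches, 1))   # ~6.6 s per batch
print(round(rows / total_seconds))         # ~45 rows/s overall

# With the Bulk API limit of 10,000 records per batch, the same file
# would fit into just 14 batches.
print(-(-rows // 10_000))                  # 14
```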

Thanks

Christoph

Mariia Zaharova posted this 2 weeks ago

Hello Christoph,

Thank you for the details.

We will check whether anything can be done on our side and let you know as soon as possible.

Best regards,

Mariia
