Salesforce records moved to a recycle bin are not replicated (incrementally) as expected

  • Last Post 14 December 2017
Admin IT posted this 28 June 2017

With a standard incremental replication package, the initial load (e.g. of Contact) loads all records except those in the recycle bin (isDeleted = 1). This works as expected.

In each subsequent incremental load, data is loaded for a specific time window, from the last load (datetime) to the current system time, based on LastModifiedDate.

If a user now moves some records to the Salesforce recycle bin, LastModifiedDate is presumably not affected by this action. Nevertheless, as with the initial load, we expect that binned records no longer exist in the target object, or at least that their isDeleted attribute value is updated accordingly.

In the current situation, however, the time window of the next incremental load (based on LastModifiedDate) does not pick up those records for replication, so they remain unchanged in the target database (e.g. Azure SQL). This results in a data inconsistency between Salesforce and the target database.

After 15 days, or after a manual clean-up of the recycle bin, the next replication reflects the change as expected and also removes those records from the target object. This may be because Salesforce now also updates the LastModifiedDate value ...

Skyvia should preferably use SystemModstamp instead of LastModifiedDate to identify candidate records for replication, and/or should additionally always consider all records with isDeleted=1, independent of any date values.
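
A minimal sketch of the windowing proposed above, for illustration only. It builds a SOQL query that filters on SystemModstamp rather than LastModifiedDate and selects IsDeleted, so that binned rows can be flagged or removed in the target. The helper names (`soql_datetime`, `build_incremental_soql`) and the sample field list are hypothetical; this is not Skyvia's implementation. It assumes that moving a record to the recycle bin updates SystemModstamp, and that the query is sent through a call that returns deleted rows (queryAll in the API, or ALL ROWS in Apex), since a plain query excludes the recycle bin.

```python
from datetime import datetime, timezone


def soql_datetime(value: datetime) -> str:
    """Format a timezone-aware datetime as an unquoted SOQL UTC literal."""
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_incremental_soql(sobject, fields, last_load, now):
    """Build the query for one incremental window (last_load, now].

    Hypothetical helper: the window is evaluated on SystemModstamp, and
    IsDeleted is always selected so the target can mirror soft deletes.
    """
    columns = ", ".join(["Id", "IsDeleted", "SystemModstamp", *fields])
    return (
        f"SELECT {columns} FROM {sobject} "
        f"WHERE SystemModstamp > {soql_datetime(last_load)} "
        f"AND SystemModstamp <= {soql_datetime(now)}"
    )


# Example window; the dates and the LastName field are made up for illustration.
query = build_incremental_soql(
    "Contact",
    ["LastName"],
    datetime(2017, 6, 27, 22, 0, tzinfo=timezone.utc),
    datetime(2017, 6, 28, 6, 0, tzinfo=timezone.utc),
)
print(query)
```

Using a half-open window (exclusive start, inclusive end) means consecutive runs neither overlap nor leave a gap, provided each run stores its `now` value as the next `last_load`.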

Mariia Zaharova posted this 29 June 2017

Please refer to: https://developer.salesforce.com/docs/atlas.en-us.api.meta/api/polling_for_changes.htm#topic-title

It says: "We recommend polling no more frequently than every five minutes. There are built in controls to prevent errant applications from invoking the data replication API calls too frequently".

The behaviour you described can be reproduced if a package runs directly after the deletion in Salesforce. Are you seeing that same behaviour, or are rows not deleted from the database table even when the next package run starts a long time after the deletion?

Admin IT posted this 07 July 2017

Dear Mariia

Initial Testing

We have re-initialized all incremental loads and are now running PowerQuery reports against both the Azure SQL replication tables and Salesforce directly. After the initialization everything was fine. Note: we always waited 5 minutes after the last Salesforce data modification before re-running a Skyvia replication package.
Our tests, in which we put items in the recycle bin and restored them afterwards with replication cycles in between, have not produced any differences so far.

Ongoing Monitoring with PowerQuery

However, with PowerQuery we still see record differences evolving over several days. The records with the maximum date value are the same in both systems, but the record counts differ. We identified differences after just one day and are now investigating in more depth which circumstances are responsible.
>> We will keep you informed about our findings in the next 1-2 weeks.
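
The count comparison described above can be narrowed down to individual records by comparing Id sets rather than counts. A minimal sketch, assuming both Id lists have already been exported (for example from the two PowerQuery reports); `reconcile_ids` and the sample Ids are hypothetical.

```python
def reconcile_ids(source_ids, target_ids):
    """Return the Ids that exist on only one side of the replication."""
    source, target = set(source_ids), set(target_ids)
    return {
        # In Salesforce but never replicated to the target.
        "missing_in_target": sorted(source - target),
        # Still in the target although no longer active in Salesforce,
        # e.g. records sitting in the recycle bin or already purged.
        "stale_in_target": sorted(target - source),
    }


# Made-up Ids for illustration only.
salesforce_ids = ["003A", "003B", "003D"]
azure_sql_ids = ["003A", "003B", "003C", "003D"]

diff = reconcile_ids(salesforce_ids, azure_sql_ids)
print(diff)
```

Equal maximum dates with different counts would show up here as a non-empty `stale_in_target` list, which points at deletes rather than at missed inserts or updates.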

Question: SystemModstamp versus LastModifiedDate

Independent of the difference issue: why does your incremental logic rely on the LastModifiedDate attribute rather than the more system-relevant SystemModstamp attribute? Salesforce always updates SystemModstamp, whereas LastModifiedDate is not updated under certain circumstances and therefore does not always reflect the latest point in time of a modification. SystemModstamp would be more accurate.

With best regards
Michael

Mariia Zaharova posted this 11 July 2017

Thank you for the additional information and the suggestion to use SystemModstamp instead of LastModifiedDate. We will consider it and investigate your issue more closely. We will contact you as soon as any results are available.

Admin IT posted this 21 July 2017

Dear Mariia

We still encounter differences between the number of records in Salesforce and in Azure SQL.
Presumably this is the result of some "improper" cleanup batch jobs and/or is related to custom objects from the Salesforce application PropertyBase V3. The cleanup batch jobs are also part of the PropertyBase application.

We are now getting in contact with PropertyBase consultants to narrow down potential issues caused by this application.

From your point of view, could this potential application issue explain the replication differences?

Note: our Excel PowerQuery queries for analysing the differences are always "correct/consistent" for native Salesforce objects and for any PropertyBase objects. Compared with the query logic of the PowerQuery Salesforce connector, however, Skyvia seems to recognize/handle deleted (recycle bin) objects differently in its replication.

I'll keep you informed.

Meanwhile, maybe you can check whether other Skyvia clients that use Salesforce PropertyBase are facing similar issues, and ask your developers to take a closer look at how Salesforce applications like PropertyBase may handle data differently from native Salesforce.

With best regards
Michael

Admin IT posted this 08 August 2017

We have now run the replication for several more days.

In the end, we were not able to manually reproduce the case where items in the recycle bin are not properly replicated through Skyvia. Could it be that your team made some code improvements in the last few days?

Nevertheless, over the long run across multiple days, we still encountered a few cases with consistency issues between Salesforce and Azure SQL. For instance, Events that are in the recycle bin, or even permanently purged, are still in the Azure SQL table. This might be because we replicate Events with a filter on specific record types.

But even for Contacts and Accounts (without any filtering), we have found entries that are in the recycle bin or permanently purged and are still visible in the replication table in Azure SQL.

Because we use the replicated data in Azure SQL as the basis for our daily core reporting, it is very important to have a 1-to-1 image of all Salesforce data. As we had been struggling with the whole replication for many hours over the last few weeks, we have set up another replication in parallel with Azure Data Factory. Its incremental replication is based on SystemModstamp instead of LastModifiedDate. There we have not encountered any differences, even for items in the recycle bin. Still, we will keep the Skyvia replication as a backup process in case we encounter any issues (availability, consistency) with the Data Factory processing.

With best regards
Michael

Mariia Zaharova posted this 10 August 2017

Thank you for all the additional information you've provided. We will continue investigating this behaviour and the possibility of using the SystemModstamp field instead of LastModifiedDate in replication tasks.

As soon as any results are available, we will post here.

API User Reporting posted this 05 December 2017

Hello Michael:

Were you ever able to get to the bottom of this? I have proven to myself that Skyvia is failing, under indeterminate circumstances, to recognize records in the Recycle Bin that need to be reflected in delete replication.

My client's organization is encountering a similar problem. The records are clearly in the Recycle Bin, yet there are circumstances where Skyvia's query to find the qualifying records (for delete) from the Recycle Bin returns no rows.

Skyvia thus far has not been timely in their replies. Furthermore, they don't seem all that concerned about the problem.

Thank you.

Respectfully,

Darryll Petrancuri

Admin IT posted this 07 December 2017

Dear Darryll

We couldn't achieve a satisfactory working solution with Skyvia (product and dev team) regarding items in the recycle bin.

Therefore, in August 2017 we switched our replication from Skyvia to Azure Data Factory (ADF). To synchronize recycle bin items properly, in addition to the regular incremental extract with ADF, we pull a full list of all valid Id values daily through a separate query. Afterwards we run a cleanup procedure within Azure SQL that deletes items from the extraction tables whose Id no longer exists in the overall Id list.
Overall, the ADF approach runs faster, provides proper error notifications and is cheaper to operate than the Skyvia approach. Setting up an ADF environment initially takes a bit more time than doing the same with Skyvia, but in the long run ADF remains the preferred solution, as it also avoids the use of additional services (dependency, costs) outside the Azure environment.

Regards
Michael
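
The daily cleanup described above amounts to an anti-join delete: load the full list of valid Ids, then remove every replicated row whose Id is not in that list. A minimal sketch, using Python's built-in sqlite3 module as a stand-in for Azure SQL; `delete_orphans`, the `valid_ids` temp table and the sample rows are hypothetical and are not the actual procedure used in this thread.

```python
import sqlite3


def delete_orphans(conn, table, valid_ids):
    """Delete rows from `table` whose Id is not in valid_ids; return the count.

    `table` is interpolated into the SQL text, so it must come from
    trusted configuration, never from user input.
    """
    ids = list(valid_ids)
    if not ids:
        # An empty list most likely means the Id extract failed;
        # deleting everything in that case would wipe the table.
        raise ValueError("refusing to clean up against an empty Id list")

    cur = conn.cursor()
    cur.execute("CREATE TEMP TABLE valid_ids (Id TEXT PRIMARY KEY)")
    cur.executemany("INSERT INTO valid_ids (Id) VALUES (?)", [(i,) for i in ids])
    cur.execute(f'DELETE FROM "{table}" WHERE Id NOT IN (SELECT Id FROM valid_ids)')
    removed = cur.rowcount
    cur.execute("DROP TABLE valid_ids")
    conn.commit()
    return removed


# Made-up replication table: 003B has been deleted in the source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Contact (Id TEXT PRIMARY KEY, LastName TEXT)")
conn.executemany(
    "INSERT INTO Contact VALUES (?, ?)",
    [("003A", "Ada"), ("003B", "Bob"), ("003C", "Cy")],
)
conn.commit()

removed = delete_orphans(conn, "Contact", ["003A", "003C"])
remaining = [row[0] for row in conn.execute("SELECT Id FROM Contact ORDER BY Id")]
print(removed, remaining)
```

Because the cleanup works from the complete Id list rather than from a date window, it also catches records that were purged from the recycle bin between two incremental runs.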

API User Reporting posted this 07 December 2017

Michael:

Thank you for your reply.

Devart's 'support' is wholly unacceptable and they won't even acknowledge the problem.

I've used Workbench to query all rows, not just active ones. I've also used Developer Console Anonymous Execution of Apex code, both with a query using ALL ROWS and with the Salesforce Replication API. With both methods I have proven that Devart's Skyvia Replication is broken: there are days where it reports that no records were deleted during the timeframe in question, and yet the deleted records are clearly accessible through both methods.

Like you, I'm probably going to roll my own. Any chance you'd be willing to share your ADF solution? What did you use for a Salesforce connector?

As an aside, are you aware that you can now do this with SSIS as a service within Azure / Azure SQL Database?

Thank you for your consideration.

Respectfully,

Darryll

Simon Bubnov posted this 13 December 2017

We have reproduced and fixed the issue with Darryll's help. We will post here when Skyvia is updated.

Simon Bubnov posted this 14 December 2017

We have updated Skyvia. Now the issue won't reproduce in any replication package.
