Migrate from S3Proxy to Azure Bucket Storage
Some LogScale clusters currently operate with a hybrid approach to bucketing: the cluster sends bucketing REST calls using the S3 protocol supported by LogScale's S3 implementation, and an external proxy converts those requests into Azure-compliant requests. With the introduction of native support for Azure bucket storage, these clusters need to be converted to use it.
Note
CrowdStrike recommends you work with LogScale Product Support when planning your migration.
Limitations for switchover
Normally, a change like this could be made with a simple configuration update. In this case, however, the old configuration entities will still be present upon restart because Azure is a new entity, so there is a high possibility that global could become out of sync during the restart process, with some nodes trying to use the S3 code while others are trying to use the Azure code. In the past, LogScale has accounted for this by doing an intermediary release in which the entities were converted programmatically on startup, then switching to the official release version afterwards. This is not always feasible: if the cluster in question is very large and the restart process takes a very long time, the downtime required for two restarts may be more than the customer can comfortably accept. As such, a more manual approach is necessary.
How to migrate from S3Proxy to Azure
The process involves a gradual switch via a configuration update. This leaves the old entities in place, and any of them with the old configuration will use the S3 code and send to the proxy. The cluster runs in this state until all parties are completely comfortable with the results. This approach requires an additional enhancement in the form of API endpoints that update any segment entities and uploaded files (both repository files and shared files) still using the old bucket entity IDs to point towards the new bucket entities. LogScale can use these endpoints to update the segments and uploaded files after the migration, and also in the emergency case that a rollback to the S3Proxy implementation is necessary; in that case, the new segments and uploaded files created with the Azure entity can be pointed back to S3.
At a high level, the process involves the following:
Shut down the cluster, if required.
Update the config to use the native Azure bucketing and start the cluster.
Once it is validated that the newly created segments are using the Azure implementation successfully, use the bucket-storage-target endpoint to get the bucket entities and map the old S3Proxy IDs to the newly created Azure entities.
Use the new endpoint to bulk update the segments bucket-by-bucket. For example, if bucket ID 15 is an S3Proxy entity and its new equivalent is bucket ID 23, run the endpoint with those parameters and all segments pointing to 15 will then point to 23.
Use the new endpoint to bulk update the uploaded/lookup files bucket-by-bucket in the same way: with old ID 15 and new ID 23, all uploaded/lookup files pointing to 15 will then point to 23.
The detailed steps are:
Run the API below to validate the segments and uploaded files:
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/bucket-storage-target -H "Authorization: Bearer <YOUR_API_TOKEN>" -H "Content-Type: application/json"
This should return results with the existing S3 configuration, such as:
[{"bucket":"bucket-name","id":"1","keyPrefix":"prefix/","provider":"s3","readOnly":false,"region":"eu-central-1","segmentsUsingBucket":121,"uploadedFilesUsingBucket":12}]
The segmentsUsingBucket and uploadedFilesUsingBucket counts denote the segments using the S3 configuration and the uploaded files using the S3 configuration, respectively. uploadedFilesUsingBucket may be zero depending on whether there are any uploaded files (repository files and shared files).
Shut down the cluster, if required.
Perform a configuration update of the cluster by removing any S3-specific environment variables used for connecting to Azure via S3Proxy and replacing them with the Azure-specific variables listed below. An illustrative example of the variable swap is shown at the end of this step.
S3 specific variable | Azure specific variable | Description of Azure variable
(none) | AZURE_STORAGE_ACCOUNTNAME | Azure storage account name
(none) | AZURE_STORAGE_ACCOUNTKEY | Azure storage account key for AuthN/AuthZ
S3_STORAGE_ENDPOINT_BASE | AZURE_STORAGE_ENDPOINT_BASE | Azure storage account blob endpoint (example: "https://<AZURE_STORAGE_ACCOUNTNAME>.blob.core.windows.net/")
S3_STORAGE_ACCESSKEY | (none) | No Azure equivalent; remove
S3_STORAGE_SECRETKEY | (none) | No Azure equivalent; remove
S3_STORAGE_BUCKET | AZURE_STORAGE_BUCKET | Azure blob container name; same value as S3_STORAGE_BUCKET
S3_STORAGE_ENCRYPTION_KEY | AZURE_STORAGE_ENCRYPTION_KEY | Same value as S3_STORAGE_ENCRYPTION_KEY
S3_STORAGE_PATH_STYLE_ACCESS | (none) | No Azure equivalent; remove
S3_STORAGE_REGION | (none) | No Azure equivalent; remove
S3_STORAGE_OBJECT_KEY_PREFIX | AZURE_STORAGE_OBJECT_KEY_PREFIX | Same value as S3_STORAGE_OBJECT_KEY_PREFIX
S3_STORAGE_USE_HTTP_PROXY | AZURE_STORAGE_USE_HTTP_PROXY | Same value as S3_STORAGE_USE_HTTP_PROXY
USE_AWS_SDK has to stay until the migration to the Azure configuration has been successfully completed (that is, until there are zero segments using the old S3 configuration).
Restart the cluster if it was shut down while updating the environment variables. After restart, the LogScale cluster should be running with the new configuration. The old segment entities will still be in place with the old S3 configuration, and the new segment entities created after the configuration change will have the new Azure configuration.
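For reference, the snippet below is a hedged illustration of the variable swap in this step; every value shown is a placeholder (hypothetical) and must be replaced with your own settings:
# Before: S3Proxy-based settings (placeholder values)
S3_STORAGE_ENDPOINT_BASE=https://s3proxy.example.com
S3_STORAGE_ACCESSKEY=<access-key>
S3_STORAGE_SECRETKEY=<secret-key>
S3_STORAGE_BUCKET=logscale-bucket
S3_STORAGE_ENCRYPTION_KEY=<encryption-key>
S3_STORAGE_OBJECT_KEY_PREFIX=prefix/
USE_AWS_SDK=true

# After: native Azure settings (USE_AWS_SDK stays until the migration completes)
AZURE_STORAGE_ACCOUNTNAME=<storage-account-name>
AZURE_STORAGE_ACCOUNTKEY=<storage-account-key>
AZURE_STORAGE_ENDPOINT_BASE=https://<storage-account-name>.blob.core.windows.net/
AZURE_STORAGE_BUCKET=logscale-bucket
AZURE_STORAGE_ENCRYPTION_KEY=<encryption-key>
AZURE_STORAGE_OBJECT_KEY_PREFIX=prefix/
USE_AWS_SDK=true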
Validate that the newly created segments and any new uploaded files are using the Azure configuration by running the curl command below:
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/bucket-storage-target -H "Authorization: Bearer <YOUR_API_TOKEN>" -H "Content-Type: application/json"
This should return results with both the old S3 configuration and the new Azure configuration, such as:
[{"bucket":"bucket-name","id":"1","keyPrefix":"prefix/","provider":"s3","readOnly":false,"region":"eu-central-1","segmentsUsingBucket":36,"uploadedFilesUsingBucket":12},{"bucket":"bucket-name","id":"2","keyPrefix":"prefix/","provider":"azure","readOnly":false,"region":"eastus","segmentsUsingBucket":108,"uploadedFilesUsingBucket":2}]
Note the "id" of both configurations: S3 has id = 1 and Azure has id = 2 in the JSON above. The segmentsUsingBucket key for the Azure provider should have a non-zero value at this point, implying that newly created segments are using the Azure configuration, while the older ones (prior to the configuration change) are using the S3 configuration. There may be more than one S3 entity, so take inventory of each one and its Azure counterpart by making sure all the rest of the configuration matches between the pairs; the ID number pairs will be needed later. uploadedFilesUsingBucket denotes the number of uploaded files (both repository and shared files), which may or may not be zero depending on the number of uploaded files.
Let the cluster run for 30 minutes as a bake time and monitor it during that time.
To migrate the older segments still using the S3 configuration, run the API below, which bulk updates those segments to use the new Azure configuration. You will need the old configuration bucket ID and the new ID from the step above. (In the example above, the old ID in the curl result is 1 and the new ID is 2.)
curl -v -X POST "$YOUR_LOGSCALE_URL/api/v1/bucket-storage-target/update-segments-storage-target?oldStorageTargetId=<old_id_here>&newStorageTargetId=<new_id_here>" -H "Authorization: Bearer <YOUR_API_TOKEN>" -H "Content-Type: application/json"
Ensure that the API call was successful; you should get an HTTP/2 204 back.
Repeat this step for each unique S3 bucket entity, referring to the ID mappings you recorded in the previous step.
To migrate the older uploaded files still using the S3 configuration, run the API below, which bulk updates those uploaded files to use the new Azure configuration. Again, you will need the old configuration bucket ID and the new ID from the step above. (In the example above, the old ID in the curl result is 1 and the new ID is 2.)
curl -v -X POST "$YOUR_LOGSCALE_URL/api/v1/bucket-storage-target/update-uploaded-files-storage-target?oldStorageTargetId=<old_id_here>&newStorageTargetId=<new_id_here>" -H "Authorization: Bearer <YOUR_API_TOKEN>" -H "Content-Type: application/json"
Ensure that the API call was successful; you should get an HTTP/2 204 back.
Repeat this step for each unique S3 bucket entity, referring to the ID mappings you recorded earlier; a combined loop sketch follows below.
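If there are several S3Proxy entities, a small shell loop can drive both bulk-update endpoints for each ID pair. This is only a sketch under the same assumptions as above; the PAIRS list and the 1:2 mapping are hypothetical and must be replaced with the pairs you recorded:
# Hypothetical old:new ID pairs, e.g. "1:2 15:23" for multiple S3Proxy entities
PAIRS="1:2"
for pair in $PAIRS; do
  old=${pair%%:*}; new=${pair##*:}
  # Re-point segments from the old S3Proxy entity to the new Azure entity
  curl -s -o /dev/null -w "segments $old->$new: %{http_code}\n" -X POST \
    "$YOUR_LOGSCALE_URL/api/v1/bucket-storage-target/update-segments-storage-target?oldStorageTargetId=$old&newStorageTargetId=$new" \
    -H "Authorization: Bearer $YOUR_API_TOKEN"
  # Re-point uploaded files (repository and shared files) the same way
  curl -s -o /dev/null -w "uploaded files $old->$new: %{http_code}\n" -X POST \
    "$YOUR_LOGSCALE_URL/api/v1/bucket-storage-target/update-uploaded-files-storage-target?oldStorageTargetId=$old&newStorageTargetId=$new" \
    -H "Authorization: Bearer $YOUR_API_TOKEN"
done
Each call should report a 204 status.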
Monitor the cluster. It will take some time for all the old segments and uploaded files to migrate to the new Azure configuration, depending on their number. To monitor the progress, run the bucket-storage-target API again:
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/bucket-storage-target -H "Authorization: Bearer <YOUR_API_TOKEN>" -H "Content-Type: application/json"
The segmentsUsingBucket and uploadedFilesUsingBucket counts for the S3 provider should gradually reduce to zero, indicating that no older segments or uploaded files are using the S3 configuration anymore and all of them have been migrated to the new Azure configuration:
[{"bucket":"bucket-name","id":"1","keyPrefix":"prefix/","provider":"s3","readOnly":false,"region":"eu-central-1","segmentsUsingBucket":0,"uploadedFilesUsingBucket":0},{"bucket":"bucket-name","id":"2","keyPrefix":"prefix/","provider":"azure","readOnly":false,"region":"eastus","segmentsUsingBucket":144,"uploadedFilesUsingBucket":14}]
Keep monitoring the cluster until you see the segmentsUsingBucket and uploadedFilesUsingBucket counts for the S3 provider reach zero.
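To watch the counts drain without rerunning the command by hand, the sketch below polls the endpoint once a minute, again assuming jq and the same shell variables as above; the interval is arbitrary:
# Print the remaining S3 usage once a minute; stop (Ctrl-C) when both counts reach zero
while true; do
  curl -s $YOUR_LOGSCALE_URL/api/v1/bucket-storage-target \
    -H "Authorization: Bearer $YOUR_API_TOKEN" \
    | jq -r '.[] | select(.provider == "s3")
             | "id=\(.id) segments=\(.segmentsUsingBucket) uploadedFiles=\(.uploadedFilesUsingBucket)"'
  sleep 60
done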
Emergency rollback
If necessary, you can roll back the cluster from the new Azure configuration to the S3 configuration using S3Proxy. This is for cases where there is unintended behavior during the migration to the Azure configuration. At a high level, it is the reverse of the migration steps.
Shut down the cluster, if required.
Revert the configuration variables, changing the Azure ones back to S3 and making sure that no Azure configuration variables remain.
Restart the cluster if the cluster was shut down while updating the environment variables. At this point the LogScale cluster should be running with the new S3 configuration (Azure configuration is the older one in this case). The old segment entities and uploaded files would still be in place with the old Azure configuration and the new segment entities and uploaded files created after the configuration change would have the new S3 configuration.
Validate that the newly created segments and any new uploaded files are using the S3 configuration by running the curl command below:
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/bucket-storage-target -H "Authorization: Bearer <YOUR_API_TOKEN>" -H "Content-Type: application/json"
This should return results with the old Azure configuration and the new S3 configuration, such as:
[{"bucket":"bucket-name","id":"1","keyPrefix":"prefix/","provider":"s3","readOnly":false,"region":"eu-central-1","segmentsUsingBucket":108,"uploadedFilesUsingBucket":12},{"bucket":"bucket-name","id":"2","keyPrefix":"prefix/","provider":"azure","readOnly":false,"region":"eastus","segmentsUsingBucket":36,"uploadedFilesUsingBucket":2}]
Note the "id" of both configurations: S3 has id = 1 and Azure has id = 2. The segmentsUsingBucket key for the S3 provider should have a non-zero value at this point, implying that newly created segments are using the S3 configuration, while the older ones (prior to the configuration change) are using the Azure configuration. Again, there may be multiple entities, so refer to the mappings you made during the migration. uploadedFilesUsingBucket denotes the number of uploaded files (both repository and shared files), which may or may not be zero depending on the number of uploaded files.
Let the cluster run for 30 minutes as a bake time and monitor it during that time.
To migrate the older segments still using the Azure configuration, run the API below, which bulk updates those segments to use the S3 configuration again. You will need the old configuration bucket ID and the new ID from the step above; note that old and new are switched in the rollback case. In this example, the old ID is 2 (Azure) and the new ID is 1 (S3), since you are switching back from Azure to S3.
curl -v -X POST "$YOUR_LOGSCALE_URL/api/v1/bucket-storage-target/update-segments-storage-target?oldStorageTargetId=<old_id_here>&newStorageTargetId=<new_id_here>" -H "Authorization: Bearer <YOUR_API_TOKEN>" -H "Content-Type: application/json"
Ensure that the API call was successful; you should get an HTTP/2 204 back. To move any uploaded files created under the Azure configuration back as well, run the update-uploaded-files-storage-target endpoint with the same ID parameters.
Repeat this step for each unique bucket entity pairing.
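The loop sketched in the migration section can be reused here by swapping each hypothetical pair (for example 2:1 instead of 1:2), so that both the segments and the uploaded-files endpoints point everything back at the S3 entities.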
Monitor the cluster. It will take some time for all the old segments and uploaded files (if any) to migrate back to the S3 configuration, depending on their number; depending on the cluster size, this could range from a few minutes to days. To monitor the progress, run the bucket-storage-target API again:
curl -v -X GET $YOUR_LOGSCALE_URL/api/v1/bucket-storage-target -H "Authorization: Bearer <YOUR_API_TOKEN>" -H "Content-Type: application/json"
The segmentsUsingBucket and uploadedFilesUsingBucket counts for the Azure provider should gradually reduce to zero, indicating that no older segments are using the Azure configuration anymore and all of them have been migrated to the S3 configuration:
[{"bucket":"bucket-name","id":"1","keyPrefix":"prefix/","provider":"s3","readOnly":false,"region":"eu-central-1","segmentsUsingBucket":144,"uploadedFilesUsingBucket":12},{"bucket":"bucket-name","id":"2","keyPrefix":"prefix/","provider":"azure","readOnly":false,"region":"eastus","segmentsUsingBucket":0,"uploadedFilesUsingBucket":0}]
Keep monitoring the cluster until you see the segmentsUsingBucket and uploadedFilesUsingBucket counts for the Azure provider reach zero.