Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

oneclickexport: add core export functionality #39813

Merged
merged 2 commits into from
Aug 4, 2022

Conversation

sashaostrikov
Copy link
Contributor

@sashaostrikov sashaostrikov commented Aug 2, 2022

This is the first PR of 1-click export effort.
This includes ExportRequest processing, site config fetching and redacting and creating a zip archive.

‼️ This PR targets the branch of this PR

Closes https://github.com/sourcegraph/sourcegraph/issues/39616 and https://github.com/sourcegraph/sourcegraph/issues/39619

Test plan

New test is added.

@sashaostrikov sashaostrikov requested a review from a team August 2, 2022 12:36
@cla-bot cla-bot bot added the cla-signed label Aug 2, 2022
@sashaostrikov sashaostrikov force-pushed the ao/one-click-export-core branch from cf328a5 to b0cc089 Compare August 2, 2022 15:27
Base automatically changed from ao/redacted-site-conf-format to main August 3, 2022 07:41
This includes ExportRequest processing, site config fetching and redacting and creating a zip archive.
Copy link
Contributor

@varsanojidan varsanojidan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Sash


configBytes := []byte(siteConfig.Site)

err = ioutil.WriteFile(dir+"/site-config.txt", configBytes, 0644)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason why .txt instead of .json?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not really, it is just how usually we get this files in customer issues. basically doesn't matter, IMO, I think valid JSONs can have .json, and other text files will be with .txt

Copy link
Contributor

@indradhanush indradhanush left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments. No blockers.

Sidenote, when introducing totally new code it would be nice to add a short write up on how the new components interact and work with each other. The overview will help reviewers seeing the code for the first time build a mental model.

Comment on lines 15 to 16
// Export function accepts an ExportRequest and returns bytes of a zip archive
// with requested data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Edited for brevity and full sentences.

Suggested change
// Export function accepts an ExportRequest and returns bytes of a zip archive
// with requested data
// Export accepts an ExportRequest and returns bytes of a zip archive
// with requested data.


type OneClickExporter struct {
logger log.Logger
cfgProcessors map[string]Processor[ConfigRequest]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Is this easier to read?

Suggested change
cfgProcessors map[string]Processor[ConfigRequest]
configProcessors map[string]Processor[ConfigRequest]

// 1) tmp directory is created (exported files will end up in this directory and this directory is zipped in the end)
// 2) ExportRequest is read and each corresponding processor is invoked
// 3) Tmp directory is zipped after all the Processors finished their job
func (e *OneClickExporter) Export(request *ExportRequest) ([]byte, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason to pass the arg as a pointer? Can we pass this as value instead? Do you anticipate this struct to be particularly large in the near future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in fact it should not be a big one. plenty of fields -- yes. but most of them are just strings under 30 characters, ints and bools, so I think we can remove the pointer here 👍

// 3) Tmp directory is zipped after all the Processors finished their job
func (e *OneClickExporter) Export(request *ExportRequest) ([]byte, error) {
// 1) creating a tmp dir
dir, err := ioutil.TempDir(os.TempDir(), "export-*")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the timestamp as the suffix instead of export-*, that is (export-%s, time.Now()).

This should avoid conflicts when the directory already exists if there's a chance that multiple export jobs have been triggered simultaneously.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will not be any conflicts, that's why I picked this function. Here is its doc: https://sourcegraph.com/github.com/golang/go/-/blob/src/os/tempfile.go?L78-80

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh TIL! Thanks for the link. Can you add a note on why the -* in the pattern in that case for future code readers?

return nil
}

// creating file header
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Comments are usually better read in the the imperative mood (create, not creating) similar to commit messages.

// 1) tmp directory is created (exported files will end up in this directory and this directory is zipped in the end)
// 2) ExportRequest is read and each corresponding processor is invoked
// 3) Tmp directory is zipped after all the Processors finished their job
func (e *OneClickExporter) Export(request *ExportRequest) ([]byte, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm noticing a lot of redundancy in the naming here which could make it hard when imported and used by other packages. For example:

exporter := oneclickexport.OneClickExporter{}
exporter.Export()

How about something like this instead:

OneClickExporter -> Agent / Actor
Export -> Run

And then this might look like:

exportAgent := oneclickexport.Agent{}
exportAgent.Run()

Or:

exportActor := oneclickexport.Actor{}
exportActor.Run()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to stick to Exporter interface and Export function, because it perfectly describes what is happening.
oneclickexport.OneClickExporter is indeed too much, but Actor or Agent is a bit too abstract IMO, I'll think for a better naming of this one


// Process function of SiteConfigProcessor loads site config, redacts the secrets
// and stores it in a provided tmp directory dir
func (g SiteConfigProcessor) Process(_ *ConfigRequest, dir string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as above in regards to pointer vs passing the value directly?

@sashaostrikov
Copy link
Contributor Author

@indradhanush I am merging this PR to unblock other dependent PRs. Basically, I've addressed/replied to all of your commits. If there is anything more, I'll fix it in next commits :)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement core archiving logic
3 participants