Talk:Presto (SQL query engine)

Users with Conflict of Interest

This lists users with a conflict of interest.

Brianolsen2 (talk) 21:33, 3 October 2022 (UTC)

Let's stop removing factual content from the Presto Wikipedia article

Recently someone in the Wikipedia community has been repeatedly removing factual content from the Presto (SQL Query Engine) page. This includes references to the Presto Software Foundation and to the creators of Presto, links to contributors to the project, and links to the Presto Software Foundation project page.

There is some disagreement among developers about where the code contributions to Presto should go. Like many open source projects, there are many repos (also sometimes referred to as "forks") on GitHub. The two main active ones are: https://github.com/prestosql/presto and https://github.com/prestodb/presto

At this point in the discussion, I don't think it's worth summarizing the history of the repos, showing the data on who mainly contributes to each repo, or discussing which code is shared or borrowed between the active repos. The point of this discussion is that factual pieces of the article on Presto are being removed repeatedly. This has happened often enough that it appears intentional. I believe factual content should remain in Wikipedia regardless of opinion on the project structure.

As a separate thread, we should discuss how to agree on content where it's important to know the differences between prestodb and prestosql. For example, links to both community sites are currently listed, which I think is fair. It is not fair to remove one in favor of the other. This has been done repeatedly and should stop.

I believe this can and should be resolved amicably and we should not resort to removing factual content that one does not like. Mattsfuller (talk) 03:57, 30 May 2019 (UTC)


Mattsfuller please remember to sign your posts on the talk page.

The article should link to both instances of the databases equally and it should correctly document that PrestoSQL is a fork of PrestoDB. It shouldn't try to present any Presto implementation as the one true implementation. Does this sound reasonable to you? What do you think of my suggestions in the other section?

Can you give a reason for why any of the above should not be documented in the article? What isn't factual or relevant? Adweisbe (talk) 23:22, 29 May 2019 (UTC)


Thanks for the reminder Adweisbe. I edited to sign my post above.

My concern for this section was not PrestoDB vs. PrestoSQL. My concern was the willful removal of Martin, David, Dain, and Eric as the original authors of Presto, the removal of the sentence about the formation of the Presto Software Foundation, the removal of the references to the press release regarding the Presto Software Foundation, and the listing of Facebook as the only contributor to the project. Doing so is against the core purpose of Wikipedia.

Regarding your questions about PrestoDB vs. PrestoSQL I will respond to that in the other section.

Mattsfuller (talk) 04:05, 30 May 2019 (UTC)

Documenting PrestoSQL and PrestoDB

I want to start a discussion on how to document PrestoDB and PrestoSQL. Mattsfuller, Findepi, Electrum. It seems to me like we should include information about both and how they split into separate software projects. Because they are separate projects I think we want to link to separate pages for each project and only put information common to both here.

PrestoSQL is a fork of PrestoDB. Presto came into this world as PrestoDB and PrestoDB is still around. It wasn't renamed to PrestoSQL. Trying to make it look like PrestoDB is not around anymore doesn't reflect the reality that there are two active Presto projects and that one is a fork of the other. This is the objective truth supported by various citable sources (blog posts, articles, commit history for both projects).

WRT to the Website link at the top right. How about we link to both?

WRT to the latest release version. How about we remove that entirely? Adweisbe (talk) 20:06, 29 May 2019 (UTC)


Thank you Adweisbe for starting this discussion.

I think it's certainly fair and accurate to acknowledge different flavors of Presto such as PrestoDB and PrestoSQL. I agree that it is a fact that PrestoDB, originally at github.com/facebook/presto and later moved to github.com/prestodb/presto, was the original repository for the project. This is not what I was disputing, and I am happy to acknowledge it. I was disputing what I described in the other section: someone intentionally and repeatedly deleting factual content about the creators of Presto and the existence of the Presto Software Foundation. I want to make the distinction clear.

It was not my intention to make PrestoDB look like it is not around anymore, and I don't feel that I did so in the way it is being interpreted. I added information and did not replace any. I started participating in the community around fall 2014, helped originally create this article in 2015, and have contributed over the years. When the PSF was formed and I reformatted the page with the top right box, it seemed obvious to me to point the latest software version to PrestoSQL. From my interactions and discussions with Facebook Inc. leadership, my understanding was that there was no intention for their employees to continue to participate in the community around Presto. Therefore I did not think much of it when I added the version download and website link on the top right to point to PrestoSQL. Again, this was new information that I added; it did not replace anything. Further, I did not remove the link to Facebook's community page in the External Links section. However, either I was misinformed or decisions at Facebook Inc. changed, because it is clear that employees intend to participate in a community. The failure was in not appropriately updating the page to have both once I became aware of that intention. I think we can easily reconcile this per your suggestions.

I am glad to have this discussion and happy to acknowledge the different Presto variations where appropriate, and to refer to them appropriately. I do not doubt, and never have doubted, the objective truth. I only ask that others respect the same with regard to additional flavors of Presto (i.e. PrestoSQL).

Per your suggestions:

> WRT to the Website link at the top right. How about we link to both?

I agree. As an alternative, maybe we keep both in external links rather than on the top right. I think having both on the top right may be confusing and also look aesthetically odd.

> WRT to the latest release version. How about we remove that entirely?

I agree.

Mattsfuller (talk) 05:10, 30 May 2019 (UTC)


>> WRT to the Website link at the top right. How about we link to both?

> an alternative, maybe we keep both in external links rather than on the top right.

Mattsfuller, yes, I think it's better.

Adweisbe, saying "PrestoSQL is a fork of PrestoDB" implies some meaning which is not necessarily objective truth. If you consider the fact that *all* top Presto contributors are working on PrestoSQL, you could quite easily say the contrary: that "PrestoDB is a fork of PrestoSQL". I think we could express the objective truth without implying additional meaning. Perhaps something like this: "In January, the Presto project was split into two projects. Creators of Presto founded the Presto Software Foundation and created the PrestoSQL repository. Facebook continues to maintain the PrestoDB repository."

Findepi (talk) 07:49, 30 May 2019 (UTC)


That is a valid point, Findepi. "Fork" may not be an appropriate term to use in this situation. By strict definition, PrestoSQL may resemble a fork. But using such a term implies more than what may be intended. There are many forks of Presto for a variety of purposes. "Fork" can also be used as a divisive term to stir emotion. Using the term would therefore require a longer explanation, such as that the vast majority of the people who have contributed the most code to Presto no longer contribute to PrestoDB and now contribute to PrestoSQL (as you pointed out, Findepi), or that pieces of the code are cherry-picked in both directions. There would be a lot to work out so that the reader has a complete understanding if such a term is used. Therefore I suggest avoiding the term "fork" entirely. Mattsfuller (talk) 11:43, 30 May 2019 (UTC)


> WRT to the Website link at the top right. How about we link to both?

> WRT to the latest release version. How about we remove that entirely?

It seems that we all agree on these points, so I removed the release version and added links to both websites.

> It seems to me like we should include information about both and how they split into separate software projects.

I also agree that this would be worthwhile. Do you have any suggestions on what sources would be appropriate for such content?

Electrum (talk) 08:20, 1 June 2019 (UTC)


Electrum thank you for adding the link this weekend.

Electrum, Mattsfuller, Findepi: instead of calling something a fork, how about a more explicit description of the current state of things? I think we can avoid controversial language while communicating the situation better.

Currently this line in history reads:

In January 2019, the Presto Software Foundation was announced. The foundation is a not-for-profit organization dedicated to the advancement of the Presto open source distributed SQL query engine[4][5].

What if it was updated to read something like:

In January 2019, the Presto Software Foundation was announced. The foundation is a not-for-profit organization dedicated to the advancement of the Presto open source distributed SQL query engine[4][5]. Development of Presto continues independently with PrestoDB owned by Facebook and PrestoSQL owned by the Presto Software Foundation with some cross pollination of code.

There are two key points I want to add to the article: that development of the two projects continues independently, and that there is some cross pollination of commits (as mentioned by Mattsfuller). I think it's also worth clarifying who owns PrestoDB and who owns PrestoSQL, so that it's understood what the links in the info box are about.

It might even be worth rethinking the links in the info box so that they somehow communicate what they are without requiring people to read the article.

WDYT? Adweisbe (talk) 17:00, 3 June 2019 (UTC)


Thanks Adweisbe. I think that avoids potentially controversial language and succinctly describes the scenario. I think it's good. I'd be interested to hear Electrum's and Findepi's thoughts on your suggestion as well. Instead of "owned by" I would suggest the language "maintained by." Perhaps we remove the links from the infobox altogether? I'm not sure how to communicate it within there either.

Mattsfuller (talk) 02:14, 4 June 2019 (UTC)


I agree RE "owned" vs "maintained". I would rather have the links in the info box and do nothing. I know I end up using them a lot. I hate having to search through the article for that sort of thing. Adweisbe (talk) 18:30, 4 June 2019 (UTC)


I like that sentence and agree that it seems to be a neutral, succinct explanation of the current state. Thanks for adding it. Electrum (talk) 04:31, 8 June 2019 (UTC)


Shouldn't the recent formation of the Presto Foundation (confusingly similar name to Presto Software Foundation) be included in this article?

See https://www.linuxfoundation.org/press-release/2019/09/facebook-uber-twitter-and-alibaba-form-presto-foundation-to-tackle-distributed-data-processing-at-scale/ for an announcement. TedDunning (talk) 18:11, 12 November 2019 (UTC)


I added a line for it. Adweisbe (talk) 20:04, 12 November 2019 (UTC)

Trino vs Presto

I have redirected Trino (SQL query engine) to Presto (SQL query engine) (see diff) to avoid a WP:CONTENTFORK. The new Trino article was started as a fork of this one, and it just added even more confusion about Presto and Trino. I think it would be better to improve this article, in particular the History section, and discuss in this talk page how the content about Presto and Trino should be split. MarioGom (talk) 10:55, 3 October 2021 (UTC)


Hi MarioGom.

Thanks for clearing up the way to go about this, and I apologize for doing so outside of the Wikipedia policies. I am still a little new and figuring out the right way to go about this.

I am definitely close to the Trino project as I am a contributor to the project and work as a Developer Advocate at a company that builds an enterprise version of the project. I will figure out all the ways I need to comply to make updates or suggestions moving forward.

As per the message you sent me, it seems since I have a clear conflict I am limited to suggesting edits but I'm not sure who can add these changes at the end of the day.

Trino is now a separate entity and is not Presto. The projects have no intention of rejoining in the same way https://en.wikipedia.org/wiki/Jenkins_(software) is now a different project from https://en.wikipedia.org/wiki/Hudson_(software). So the concern for a https://en.m.wikipedia.org/wiki/Wikipedia:CONTENTFORK is not applicable in this case.

What are the steps forward we can take to get the Trino page reestablished and who has the authority to do so if I have a conflict of interest? There is actually more confusion added if these projects are not separate entities.

I made an initial attempt to make a differing page that described Trino and Brianolsen2 (talk) 19:50, 4 October 2021 (UTC)


Hi Brianolsen2: First of all, I left a message in your talk page about paid editing requirements. Note that you can propose changes, and discuss them in the talk page like any other editor. You should generally avoid making these changes directly.
I'm aware of the Trino/Presto dispute, as well as the Jenkins/Hudson one. Thank you for bringing that example up. My main issue with the version of the Trino article you created initially is that it felt like retroactively rewriting history, as well as copying content in a way that both articles offered parallel realities.
We should start with what reliable sources say about the topic. Is there any reliable source that says that Trino is the original Presto, and that the current Presto is a fork of the original Presto? Or any similar claim? How is the naming dispute and fork described by such sources? MarioGom (talk) 17:44, 7 October 2021 (UTC)

Hi MarioGom, It's been a while. I'm still working out the best way to get this cleared up and we have looked for a Wikipedian in the Trino community that might be able to help here with little luck. I wanted to follow up on a few questions.

  1. Should the CEO of Ahana StevenMih88 be able to make edits on the Presto page since there is a clear conflict of interest as there was with my case?
  2. I wanted to know if we provide reliable sources, would I be able to ask for your help to create a small initial Trino software page to at least distinguish between the two forks of the software? I'm curious to know if this blog created by the founders of both Trino and Presto would be enough to validate the story behind the project split. These sources are reliable as they are the founders of the project, as can be seen by the GitHub contribution graphs for both Trino and Presto. Martin (martint), Dain (dain), and David (electrum) are in the top 4 contributors and have the longest history in both repositories. They are responsible for creating both the original project and the Trino fork. Would accounts from them be good enough to establish that Trino is a reputable fork, and would you be willing to make a separate page so that it clearly distinguishes between the two projects?

    I have revised my initial version of the Trino page in my sandbox as a suggestion for the initial Trino page. I've:
    • Removed all segments of the History section that were redundant with the Presto page, to focus on Trino's history starting from 2019 when the fork was created.
    • Referred to the Presto History for the remainder of the back story.
    • Removed the Use Case section for now; we can revisit this later.
    • Added many more citations using the Trino: The Definitive Guide book.
  3. Would it also be possible for us to list these forks on this List of Software Forks Wikipedia page?

Thanks for all the help MarioGom!

Brianolsen2 (talk) 17:04, 25 August 2022 (UTC)

Also want to extend the request to W_Nowicki if you are interested in helping. I notice you've done some previous work related to Presto and Trino pages in the past. Thanks! Brianolsen2 (talk) 15:51, 1 September 2022 (UTC)
Hey @Uhai! I saw some of the edits you added to this page. Would you be interested in helping us create a separate page for the Trino project? Brianolsen2 (talk) 17:31, 20 September 2022 (UTC)

Thank you Smga3000 for taking my edits into consideration! Brianolsen2 (talk) 20:08, 3 October 2022 (UTC)


I know I'm not supposed to make edits on this page as I am part of the Trino Software Foundation as well as a Starburst employee. I have undone some edits put forth by the CEO of Ahana, StevenMih88, as there is a clear conflict of interest. I will continue to monitor this page and remove anything he posts henceforth until we can get a proper moderator. If you would like to help moderate this page and help me with some of the items above to create a separate page for Trino, it would be greatly appreciated!

After a discussion in another forum, one of the disputes is around this claim:

> Neither the creators of Presto, nor the top contributors and committers, were invited to join this foundation.

It would be good for someone not affiliated with either project or vendor around either project to review this claim and determine if it should still remain on the page.

Brianolsen2 (talk) 11:25, 21 September 2022 (UTC)



Hi MarioGom - StevenMih88 here - I'm also requesting a review by you here on the talk page as it relates to 3 items. Thanks in advance:

1) My edit which was deleted by Brianolsen2: "Presto Foundation is an open community and everyone is welcome to participate as long as they abide by the code of conduct."

2) My edit, also undone, to the initial sentence. Since there is now a separate Trino page, I request that "Presto (including PrestoDB, and PrestoSQL which was re-branded to Trino)" be revised, to make this page less confusing and more focused on Presto, to:

      "Presto (including PrestoDB)"

Note: Trino is now in the See Also section.

3) A review of the claim that: "Neither the creators of Presto, nor the top contributors and committers, were invited to join this foundation."

Thanks again for your objective consideration. StevenMih88 (talk) 05:08, 4 October 2022 (UTC) Steven Mih

  • Specific text to be added or removed: Remove text "Neither the creators of Presto, nor the top contributors and committers, were invited to join this foundation."
  • Reason for the change: This claim is controversial and can likely lead to a WP:Battle. The statement doesn't support or help readers understand Presto or Trino and isn't WP:NEUTRAL
  • References supporting change: This claim made by Piotr (top 5 contributor to both projects) states that neither they nor the creators Martin, Dain, and David were invited[1]. Later in 2020 there was a vacuous invitation to merge the Trino project into the Presto project under the terms that Martin, Dain, and David would no longer be on the steering committee. This was extended years after the contending project was established and from the perspective of the creators was posturing of the Presto community to show that they aimed to join the communities. It seems pretty clear why we're pushing two different accounts, but this really doesn't matter to anyone who wants to use Presto or Trino, so I think removing either claim is a benefit to everyone.

Brianolsen2 (talk) 16:40, 4 October 2022 (UTC)

References

  1. ^ Findeisen, Piotr. "What is the relationship of prestosql and prestodb? · Issue #380 · trinodb/trino". GitHub. Retrieved 4 October 2022.

I agree with StevenMih88's first and third requests above, and I have created a formal edit request for the third one, as removing that claim will make the page less divisive.

I disagree that we should remove "Presto (including PrestoDB, and PrestoSQL which was re-branded to Trino)". For about two years after the projects split, Trino was branded as PrestoSQL (versus the original PrestoDB project) before it was renamed to Trino[1]. There are many people using the PrestoSQL name who are unknowingly using earlier versions of the Trino fork. It will help many to know if they are using PrestoSQL and which version they have actually been using. In fact, having a clear location on both pages where we delineate the differences between the two projects and cross-reference them will help users in general. Despite Trino being included in the Presto_(SQL_query_engine)#See_also section, they might not know to investigate that version if they are on PrestoSQL. So keeping that distinction is important, just as the distinction is made in the Trino_(SQL_query_engine)#History section.

Brianolsen2 (talk) 17:02, 4 October 2022 (UTC)

  Not done: This looks like a content dispute between COI editors. Denied. Quetstar (talk) 23:48, 6 October 2022 (UTC)

Merger proposal

Frap has proposed to merge the Presto article into the Trino article. A merger proposal wasn't added, so I am adding one and am happy to let Frap describe their reasons for this proposal. Until then I will list some of the pros and cons, in my opinion as a Developer Advocate on the Trino project and an employee at Starburst, a vendor built atop Trino.

To me, these pages are separate for a reason. The projects have distinct foundations (Presto with the Presto Foundation under the Linux Foundation and Trino under the Trino Software Foundation). The projects have diverged significantly [2]. There are a significant number of projects that exist in Trino (such as support for fault-tolerance [3]) that do not exist in Presto. Likewise, Presto has started to move its efforts to supporting integration with the Meta project Velox[4], which Trino does not plan on supporting.

Many other projects[5] have a separate wiki page for multiple forks. Trino and Presto have already been added to this list, to help clarify this.

The biggest advantage of keeping these two separate is that people interested in learning more about the technologies have a clear understanding that these are now two distinct projects that share history. Depending on the context in which they are learning, it can be confusing if someone searches for Trino and winds up on a Presto page for a different project, and it would be just as confusing if they were searching for the Facebook Presto project and wound up on a Trino page. The articles were written in such a way that they reference each other well and make the shared history clear while providing facts about each of the individual projects moving forward. This also avoids Wikipedia:Content_forking.

For these reasons, I oppose merging these two articles.

Brianolsen2 (talk) 21:22, 24 February 2023 (UTC)

You bring up some good points and I can buy your arguments. The reason I proposed the merger was that both articles were almost identical. If Trino and Presto diverge enough that the content of the articles were to be different then they would warrant separate articles. Frap (talk) 14:30, 25 February 2023 (UTC)
Yeah, there are plenty of differences to be discussed that I would like to point out on both pages. If you would be interested in writing them I can provide suggestions and sources, but I myself can't write about it as I have a conflict of interest. Would you want to help showcase the differences between these projects? Brianolsen2 (talk) 16:13, 25 February 2023 (UTC)
Frap, any thoughts here? Would love to get some help updating these pages so they aren't as identical and really highlight the differences to lighten the load of folks learning about them. Brianolsen2 (talk) 20:18, 28 February 2023 (UTC)
One clear change we probably should make is removing the Trino architecture from the Presto page. This information is redundant and, as shown in the picture, actually describes Trino's architecture rather than Presto's.
This will make the articles look different since that was copied in from the Trino page. Brianolsen2 (talk) 18:46, 14 March 2023 (UTC)
I'm going to remove the merger proposal for now since this conversation has gone stale. Brianolsen2 (talk) 20:29, 20 April 2023 (UTC)