Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Great idea to create such a repository! #1

Open
maxim-uvarov opened this issue Aug 23, 2014 · 18 comments
Open

Great idea to create such a repository! #1

maxim-uvarov opened this issue Aug 23, 2014 · 18 comments

Comments

@maxim-uvarov
Copy link
Collaborator

I've found it very useful!
Thanks a lot!

@KiaraGrouwstra
Copy link
Owner

Thanks, I appreciate it. :)
If you find any other useful function feel free to add!

@maxim-uvarov
Copy link
Collaborator Author

Thanks! I'll try to find out where I can be useful

@tgb417
Copy link

tgb417 commented Aug 28, 2014

I'm complete newbie here on GitHub. (New account 10 min ago.) However, I've got a bit of M code, I'd like to share. Related to accessing REST services to get commuting data. Not clear that I want to externalize my code like you are describing here. Not clear if I should add this to your repository, or start a new one. Chris Webb had sent me your way.

@tycho01 sort of interested in where you want to take this repository. Are you focusing on a repository of general M (Power Query) code? Maybe broken out by types over time... Or do you want to focus strictly on the method to do the externalizations. And want to let others curate a broader repository of M code....

So this may be a stupid set of questions. As I said. I'm completely new to GitHub and I do not know the culture here... Let me know what you think.

@KiaraGrouwstra
Copy link
Owner

@tgb417: Hey. No worries, I'm relatively unexperienced with Github myself. Since just the Load function is pretty short, I don't mind adding other general functions either, like those I added already. (If anyone else would like to host their own repository though, I'd gladly link to it on here, since at this point any growth in public M libraries would be great.) If you're not yet familiar with pull requests, if you'd rather paste here perhaps I could help as well?

The functions already in this repository I named with a naming convention along the lines of the default functions, though the periods cannot be kept intact inside query names (in case you'd need to add one of the functions to a workbook instead of dynamically loading it from the text file).

@tgb417
Copy link

tgb417 commented Aug 29, 2014

What editor do you use to externally edit an pretty up and comment M Code.
I've seen some discussion about using Notepad++. However, I've not gotten
it to work just right.

Once I've commented the code and cleaned it up a bit will send you a file.
However, I'd like to try a "pull request" in order to learn something more
about GIThub.

--Tom

On Fri, Aug 29, 2014 at 2:38 AM, tycho01 [email protected] wrote:

@tgb417 https://github.com/tgb417: Hey. No worries, I'm relatively
unexperienced with Github myself. Since just the Load function is pretty
short, I don't mind adding other general functions either, like those I
added already. (If anyone else would like to host their own repository
though, I'd gladly link to it on here, since at this point any growth in
public M libraries would be great.) If you're not yet familiar with pull
requests https://help.github.com/articles/creating-a-pull-request, if
you'd rather paste here perhaps I could help as well?

The functions already in this repository I named with a naming convention
along the lines of the default functions, though the periods cannot be kept
intact inside query names (in case you'd need to add one of the functions
to a workbook instead of dynamically loading it from the text file).


Reply to this email directly or view it on GitHub
#1 (comment).

@KiaraGrouwstra
Copy link
Owner

KiaraGrouwstra commented Aug 29, 2014

Yeah, Notepad++ is what I use. It doesn't have syntax highlighting specific to M, though highlighting still works okay if you set it to some other language like C. Hope you can get it to work (I think they also had a 'portable' version not requiring installation?). Tell me if I can be of help. :)

Edit @tgb417: correction, it does.
Also: Sublime Text.

@tgb417
Copy link

tgb417 commented Aug 29, 2014

K

--Tom

On Fri, Aug 29, 2014 at 6:18 AM, tycho01 [email protected] wrote:

Yeah, Notepad++ is what I use. It doesn't have syntax highlighting
specific to M, though highlighting still works okay if you set it to some
other language like C. Hope you can get it to work (I think they also had a
'portable' version not requiring installation?). Tell me if I can be of
help. :)


Reply to this email directly or view it on GitHub
#1 (comment).

@IvanBond
Copy link
Collaborator

Great! Found this when started to "expand all". Looking forward for new functions and will try to share own ideas.

@jaykilleen
Copy link

@tycho01 this is a very interesting repository. You have probably already tried this but thought I would ask. Do you know if it is possible to Text.FromURL( and then we can could Javascript like library import functionality?

This may also mean that we can create PowerBI files that don't break when the connection to the LoadPath is lost.

I think if we can build on this then we can have true functionality like you would see in RubyGems or npm (node package manager).

@jaykilleen
Copy link

jaykilleen commented May 10, 2016

For example, try wrapping Expression.Evaluate around the Text.FromBinary function below.

Text.FromBinary(Web.Contents("https://gist.githubusercontent.com/jaykilleen/4dd7ae430849fac3bec6634b65d1c759/raw/f20d288e94200aaaf3e5d78346f6a813c6cea6a2/Text.Count.m"))

@jaykilleen
Copy link

jaykilleen commented May 10, 2016

Hmm so with:

LoadURLPath.m

https://gist.githubusercontent.com/jaykilleen/4dd7ae430849fac3bec6634b65d1c759/raw/738cb116a9c076c48da08d6e143543ccd05a84bb/Text.Count.m

LoadURL.m

(fnName as text, optional BasePath as text) as function =>
let
    Source = Text.FromBinary(
      Binary.Buffer(Web.Contents(LoadURLPath)
    )),
    Function = try Expression.Evaluate(Text.Replace(fnName, ".", "_"), #shared)  //if already imported into the workbook just use the existing one
        otherwise Expression.Evaluate(Source, #shared)  //if not imported yet try loading it from the text file in the folder
in
    Function

TextCountResult.m

let
    Text.Count = LoadURL("Text.Count")
in
    Text.Count("Abba", "b")

Returns 2.

Very interesting. Is there any way we can load all the functions from one file instead of split across many? Not sure if I have seen that functionality yet in M.

@KiaraGrouwstra
Copy link
Owner

Hi @jaykilleen. Thanks for clarifying. That looks sensible.

This is also what @IvanBond has been doing in his fork (some discussion here). I'd noted there I feared it'd just keep hammering servers, as I'd had bad experience trying to use Power Query for scraping.

The use case seems valid enough though -- feel free to submit a PR. Comments:

  • preferably leave that LoadURLPath gitignore'd for similar reasons as mentioned in that thread mentioned above.
  • ditch BasePath parameter since unused
  • You may want to use a 'base url' that could generalize to multiple queries in the way the offline one did -- say https://raw.githubusercontent.com/tycho01/pquery/master/ here.

Alternatively, one could imagine having a file mapping function names to different URLs -- not unlike a package manager, as you mentioned. In fact, let's say we had a list like that on Git or whatever as well, that way others might be able to add their own as well. This way, you could imagine having one of your offline files list different such remote repositories to check for function availability (aside from what was available locally already).

I think I need to point out a potentially pain point here -- M is functional, in the Haskell sense, but even more so: it does not do side-effects.
What this means is, it does not have the capacity to store remote resources offline, and it's essentially going to bomb the server with requests every time the script runs, if not on every invocation. This doesn't mean it can't be done, but just makes it more unfortunate.

I've thought about ways that these limitations in terms of side-effects could be overcome, such as, for example, making a local server that could execute commands based on requests sent to it by the M instance. This way, the M instance could tell this local server to save your remote script locally, and would from then just try to access your file from this local source instead.

That could be done. But it'd complicate things, and raises further questions w.r.t. how this might generalize to the Power BI use-case. I haven't actually used M from an online Power BI instance myself, which may account for my myopia in that direction.

To expand though, even then, that wouldn't solve all the inherent issues. The thing is, the reason extending Power Query to enable side-effects would suck is, it wasn't meant for it. The reason there are few languages properly combining functional laziness (as M has) with side-effects is it gets complex -- the only one I know of is Haskell. Yet that bump in learning curve simultaneously tends to scare people off.

And it makes sense there aren't many languages as pure as M... because without any way to achieve side effects, it quickly gets really hard to achieve anything remotely useful, such as doing a proper package manager.
I may sound bitter here, but the unfortunate truth might be M is just a toy language with some decent use-cases in this niche of transformations. And I like its consistent API, but it's a pain to write, extend, or achieve things outside of its original scope with.
Since my frustration in my original scraping-related endeavors I switched to using ES6 (gets the good functional bits of M thru lodash/ramda) + Elixir. I also wish things were better.

@KiaraGrouwstra
Copy link
Owner

@jaykilleen: Actually, I think we're getting somewhere.
You said you wanted a package manager like npm.
I said you cannot write one in M.

I think we were getting it backward. The question isn't, how can we start from M and make it like npm, but rather, if we'd need to have the package manager be in a language other than M anyway, if we'd start from npm, how much different would we need to make it?

For the purpose of M code, it wouldn't really matter if loaded code resides in a folder named node_modules or something else. So for that matter, I believe npm itself might actually suffice for our purposes. My main concern is that they'd probably consider the repository itself to be meant for storing JavaScript-related packages. If we could say use the package manager but swap out the repository, we just might be good already.

I guess another question is how this might relate with Load context (will the main script and its dependencies have the same Load/LoadPath?) with regard to deep vs. flat package hierarchy (npm at one point switched to the latter).

@KiaraGrouwstra
Copy link
Owner

@jaykilleen: thinking about it further, that last concern about Load context shouldn't be too bad; I think in JS require just tries to resolve the path from the curent directory as well as any parent directories, and I suppose that'd address that here as well.

But as mentioned I'm not really using M anymore, so my needs are no longer very relevant. So you guys had often brought up the cloud Power BI, so I guess that might be the primary question now. If that environment is so constrained, an npm-ish package manager is probably not within the scope of what it'd allow you to do either (note: I haven't looked into this).

So then you'd be back at clunky hacks fetching scripts every time, like your LoadURL.

@jaykilleen
Copy link

@tycho01 Thanks for all the insight. A lot to digest. I think you raise an important point that it would be poor design of the queries to be hitting a server each time they are called.

Ideally, we would use VBA to copy the files locally so then the requests don't hit the server. Unfortunately, Power BI Desktop doesn't have any form of VBA implemented within it (that I am aware of).

I am spending approx. 50/50 time between Power Query and PowerBI.

Alternatively, investigate whether the call to Web.Contents is being cached, hmm, it's a tricky one. At this stage it might just be easier to use VBA metaprogramming to write the M queries using the request to the Web.Contents. You could also then, more easily, iterate over the contents of the Github repository to collect all the information, or write it all to one file and parse the entire file using VBA.

@jaykilleen
Copy link

@tycho01 I have an example over at this repo xlWeeties where I have a module created based on content of the repo stored on Github. It isn't very elegant but I got it to work and that was enough to get me excited.

@KiaraGrouwstra
Copy link
Owner

@jaykilleen: as to testing when it makes network calls, during my scraping attempts earlier I used tools like Wireshark or Fiddler2 to inspect requests.

I like the VBA on Git idea. At the time I'd tried to see if I could figure out a way to automate conversion between workbooks and contained VBA/M for Git purposes, but that turned out pretty inviable (while VBA had been extracted before, afaik M parts have not).

I did a bunch of VBA at the time, but mostly under contract, so I wasn't really in a position to open-source :(. That being said, more modern languages feel like a blessing now...

For VBA to add queries, in case you hadn't yet, you might wanna check the workbook I'd linked in the readme. By combining it with your 'grab code from url' thing it could work out to grab remote M functions.

KiaraGrouwstra pushed a commit that referenced this issue Oct 17, 2016
Remove errors from list (not column)
KiaraGrouwstra pushed a commit that referenced this issue Nov 28, 2016
@rdiazjimenez
Copy link

Super useful. Thanks for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants